A DIAGNOSTIC ASSAY FOR THE HUMAN VIRUS CAUSING
SEVERE ACUTE RESPIRATORY SYNDROME (SARS)



This application claims priority benefit to U.S. provisional application no.
60/457,031, filed March 24, 2003; U.S. provisional application no. 60/457,730, filed
March 26, 2003; U.S. provisional application no. 60/459,931, filed April 2, 2003; U.S.
provisional application no. 60/460,357, filed April 3, 2003; U.S. provisional application
no. 60/461,265, filed April 8, 2003; U.S. provisional application no. 60/462,805, filed
April 14, 2003; U.S. provisional application no. 60/464,886, filed April 23, 2003; U.S.
provisional application no. 60/468,139, filed May 5, 2003; and U.S. provisional
application no. 60/471,200, filed May 16, 2003, each of which is incorporated herein by
reference in its entirety.

The instant application contains a lengthy Sequence Listing which is being 
concurrently submitted via triplicate CD-R in lieu of a printed paper copy, and is hereby 
incorporated by reference in its entirety. Said CD-Rs, recorded on March 22, 2004, are
labeled "CRF", "Copy 1" and "Copy 2", respectively, and each contains only one
identical 1.58 MB file (V9661078.APP).

1. FIELD OF THE INVENTION 

The present invention relates to a diagnostic assay for the virus causing Severe 
Acute Respiratory Syndrome (SARS) in humans ("hSARS virus"). In particular, the 
invention relates to a quantitative assay for the detection of the hSARS virus, natural or
artificial variants, analogs, or derivatives thereof, using reverse transcription and
polymerase chain reaction (RT-PCR). Specifically, the quantitative assay is a TaqMan®
assay. The invention further relates to a diagnostic kit that comprises nucleic acid 
molecules for the detection of the hSARS virus. 

2. BACKGROUND 

Recently, there has been an outbreak of atypical pneumonia in Guangdong
province in mainland China. Between November 2002 and March 2003, there were 792
reported cases with 31 fatalities (WHO. Severe Acute Respiratory Syndrome (SARS).
Weekly Epidemiol Rec. 2003; 78: 86). In response to this crisis, the Hospital Authority in
Hong Kong has increased the surveillance of patients with severe atypical pneumonia. In
the course of this investigation, a number of clusters of health care workers with the
disease were identified. In addition, there were clusters of pneumonia incidents among
persons in close contact with those infected. The disease was unusual in its severity and
its progression in spite of the antibiotic treatment typical for the bacterial pathogens that
are known to be commonly associated with atypical pneumonia. The present inventors
were one of the groups involved in the investigation of these patients. All tests for
identifying commonly recognized viruses and bacteria were negative in these patients.
The disease was given the acronym Severe Acute Respiratory Syndrome ("SARS"). The
etiologic agent responsible for this disease was not known until the isolation of hSARS
virus from the SARS patients by the present inventors. The present invention provides a
rapid and specific real-time quantitative PCR assay as disclosed herein. The invention is
useful in both clinical and scientific research applications.

3. SUMMARY OF THE INVENTION

The invention relates to the use of the sequence information of isolated hSARS
virus for diagnostic methods. In a preferred embodiment, the sequence of the isolated
hSARS virus was deposited in GenBank, NCBI, with Accession No. AY278491 (SEQ ID
NO:15), which is incorporated herein by reference. The isolated hSARS virus was
deposited with the China Center for Type Culture Collection (CCTCC) on April 2, 2003
and accorded an accession number, CCTCC-V200303, as described in Section 7, infra,
which is incorporated by reference.

In a specific embodiment, the invention provides a diagnostic assay for the
hSARS virus, natural or artificial variants, analogs, or derivatives thereof. In particular,
the invention relates to a quantitative assay for the detection of nucleic acid molecules of
hSARS virus using reverse transcription and polymerase chain reaction (RT-PCR).
Specifically, the quantitative assay is a TaqMan® assay. Also provided in the present
invention are nucleic acid molecules that are suitable for hybridization to hSARS nucleic
acids, including, but not limited to, PCR primers, reverse transcriptase primers, and
probes for Southern analysis or other nucleic acid hybridization analysis for the detection
of hSARS nucleic acids. Said hSARS nucleic acids consist of or comprise the nucleic
acid sequence of SEQ ID NO:1, 11, 13, 15, 16, 240, 737, 1108, 1590, 1965, 2471, 2472,
2473, 2474, 2475 or 2476, or a complement, analog, derivative, or fragment thereof, or a
portion thereof. In a preferred embodiment, the primers comprise the nucleic acid
sequence of SEQ ID NOS:2471 and/or 2472. In another preferred embodiment, the
primers comprise the nucleic acid sequence of SEQ ID NOS:2474 and/or 2475. In a most
preferred embodiment, the nucleic acid molecule comprises the nucleic acid sequence of
SEQ ID NO:2473, or a portion thereof, and may be used for the detection of the hSARS
virus in an RT-PCR assay using nucleic acid molecules comprising the nucleic acid
sequences of SEQ ID NOS:2471 and/or 2472 as primers. In another most preferred
embodiment, the nucleic acid molecule comprises the nucleic acid sequence of SEQ ID
NO:2476, or a portion thereof, and may be used for the detection of the hSARS virus in an
RT-PCR assay using nucleic acid molecules comprising the nucleic acid sequences of
SEQ ID NOS:2474 and/or 2475 as primers. In yet another most preferred embodiment,
the assay is a TaqMan® quantitative assay.

In one embodiment, the invention provides methods for detecting the presence or
expression of the hSARS virus, natural or artificial variants, analogs, or derivatives
thereof, in a biological material, such as cells, blood, serum, plasma, saliva, urine, stool,
sputum, nasopharyngeal aspirates, and so forth. The increased or decreased activity or
expression of the hSARS virus in a sample relative to a control sample can be determined
by contacting the biological material with an agent which can detect directly or indirectly
the presence or expression of the hSARS virus. In a specific embodiment, the detecting 
agents are nucleic acid molecules of the present invention. In another specific 
embodiment, the detecting nucleic acid molecules are immobilized on a DNA microarray 
chip. 

In a specific embodiment, the invention provides a diagnostic kit comprising
nucleic acid molecules which are suitable for use to detect the hSARS virus, natural or
artificial variants, analogs, or derivatives thereof. In a specific embodiment, the nucleic
acid molecules have the nucleic acid sequence of SEQ ID NOS:2471 and/or 2472. In
specific embodiments, the nucleic acid molecule has the nucleic acid sequence of SEQ ID
NO:2473. In another specific embodiment, the nucleic acid molecules have the nucleic
acid sequence of SEQ ID NOS:2474 and/or 2475. In specific embodiments, the nucleic
acid molecule has the nucleic acid sequence of SEQ ID NO:2476.

In one aspect, the invention relates to the use of the isolated hSARS virus for
diagnostic methods. In a specific embodiment, the invention provides a method of
detecting mRNA or genomic RNA of the hSARS virus of the invention in a biological
material, such as cells, blood, serum, plasma, saliva, urine, stool, sputum, nasopharyngeal
aspirates, and so forth. The increased or decreased level of mRNA or genomic RNA of
the hSARS virus in a sample relative to a control sample can be determined by contacting
the biological material with an agent which can detect directly or indirectly the mRNA or
genomic RNA of the hSARS virus. In a specific embodiment, the detecting agents are
the nucleic acid molecules of the present invention. In another specific embodiment, the
detecting nucleic acid molecules are immobilized on a DNA microarray chip.

In another aspect, the invention relates to the use of the isolated hSARS virus for
diagnostic methods, such as detecting an antibody, which immunospecifically binds to
the hSARS virus, in a biological sample. In a specific embodiment, the detecting agents
are an hSARS virus, for example, of deposit no. CCTCC-V200303, or having a genomic
nucleic acid sequence of SEQ ID NO:15, or polypeptides encoded by the nucleic acid
sequence of SEQ ID NO:1, 11, 13, 15, 16, 240, 737, 1108, 1590, 1965, 2471, 2472, 2473,
2474, 2475 or 2476.

In yet another aspect, the invention provides antibodies or antigen-binding
fragments thereof which immunospecifically bind a polypeptide of the invention encoded
by the nucleotide sequence of SEQ ID NO:1, 11, 13, 15, 16, 240, 737, 1108, 1590, 1965,
2471, 2472, 2473, 2474, 2475 or 2476, or encoded by a nucleic acid comprising a
nucleotide sequence that hybridizes under stringent conditions to the nucleotide sequence
of SEQ ID NO:1, 11, 13, 15, 16, 240, 737, 1108, 1590, 1965, 2471, 2472, 2473, 2474,
2475 or 2476, and/or any hSARS epitope, having one or more biological activities of a
polypeptide of the invention. Such antibodies include, but are not limited to, polyclonal,
monoclonal, bi-specific, multi-specific, human, humanized, chimeric antibodies, single
chain antibodies, Fab fragments, F(ab')2 fragments, disulfide-linked Fvs, intrabodies and
fragments containing either a VL or VH domain or even a complementarity determining
region (CDR) that specifically binds to a polypeptide of the invention.

The present invention also relates to a method of identifying a subject infected
with the hSARS virus, natural or artificial variants, analogs, or derivatives thereof. In a
specific embodiment, the method comprises obtaining total RNA from a biological
sample obtained from the subject; reverse transcribing the total RNA to obtain cDNA;
and subjecting the cDNA to a PCR assay using a set of primers derived from a nucleotide
sequence of the hSARS virus.
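
By way of illustration only, the PCR step above relies on a primer pair that flanks a
target region of the cDNA. The following Python sketch checks in silico whether a
forward and a reverse primer would bracket an amplicon on a template. The function
names and the toy sequences are hypothetical and are not the primers of SEQ ID
NOS:2471-2476; exact matching is assumed here, whereas real primers may tolerate
some mismatches.

```python
# Hypothetical helper names; toy sequences only, not taken from the Sequence Listing.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_amplicon(template: str, forward: str, reverse: str):
    """Return the region bracketed by the primer pair, or None if either primer
    has no exact match. All sequences are written 5'->3'."""
    template = template.upper()
    forward = forward.upper()
    start = template.find(forward)
    if start == -1:
        return None
    # The reverse primer anneals to the sense strand, so its reverse
    # complement is what appears in the template downstream of the forward primer.
    reverse_site = reverse_complement(reverse.upper())
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]


if __name__ == "__main__":
    toy_template = "TTACGGATCCAGTCAGGTTTCACGTAGCATTT"
    print(find_amplicon(toy_template, "ACGGATCC", "GCTACGTG"))
```

A result of None indicates that, under the exact-match assumption, the pair would not
be expected to yield a product from the given template.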

The present invention further relates to a diagnostic kit comprising primers and a
nucleic acid probe for the detection of mRNA or genomic RNA of hSARS virus.

3.1. Definitions 

As used herein, the term "variant" refers either to a naturally occurring genetic
mutant of the hSARS virus or a recombinantly prepared variation of the hSARS virus,
each of which contains one or more mutations in its genome compared to the hSARS virus
of CCTCC-V200303. The term "variant" may also refer to either a naturally occurring
variation of a given peptide or a recombinantly prepared variation of a given peptide or
protein in which one or more amino acid residues have been modified by amino acid
substitution, addition, or deletion.

As used herein, the term "analogue" in the context of a non-proteinaceous analogue
refers to a second organic or inorganic molecule which possesses a similar or identical
function as a first organic or inorganic molecule and is structurally similar to the first
organic or inorganic molecule.

As used herein, the term "derivative" in the context of a non-proteinaceous 
derivative refers to a second organic or inorganic molecule that is formed based upon the
structure of a first organic or inorganic molecule. A derivative of an organic molecule
includes, but is not limited to, a molecule modified, e.g., by the addition or deletion of a 
hydroxyl, methyl, ethyl, carboxyl or amine group. An organic molecule may also be 
esterified, alkylated and/or phosphorylated. 

As used herein, the term "mutant" refers to the presence of mutations in the
nucleotide sequence of an organism as compared to a wild-type organism.

As used herein, the terms "antibody" and "antibodies" refer to monoclonal
antibodies, bispecific antibodies, multispecific antibodies, human antibodies, humanized
antibodies, chimeric antibodies, camelised antibodies, single domain antibodies,
single-chain Fvs (scFv), single chain antibodies, Fab fragments, F(ab') fragments,
disulfide-linked Fvs (sdFv), and anti-idiotypic (anti-Id) antibodies (including, e.g., anti-Id
antibodies to antibodies of the invention), and epitope-binding fragments of any of the
above. In particular, antibodies include immunoglobulin molecules and immunologically
active fragments of immunoglobulin molecules, i.e., molecules that contain an antigen
binding site. Immunoglobulin molecules can be of any type (e.g., IgG, IgE, IgM, IgD,
IgA and IgY), class (e.g., IgG1, IgG2, IgG3, IgG4, IgA1 and IgA2), or subclass.

As used herein, the term "antibody fragment" refers to a fragment of an antibody
that immunospecifically binds to an hSARS virus or any epitope of the hSARS virus.
Antibody fragments may be generated by any technique known to one of skill in the art.
For example, Fab and F(ab')2 fragments may be produced by proteolytic cleavage of
immunoglobulin molecules, using enzymes such as papain (to produce Fab fragments) or
pepsin (to produce F(ab')2 fragments). F(ab')2 fragments contain the complete light chain,
and the variable region, the CH1 region and the hinge region of the heavy chain.
Antibody fragments can also be produced by recombinant DNA technologies. Antibody
fragments may be one or more complementarity determining regions (CDRs) of
antibodies.

As used herein, the term "an antibody or an antibody fragment that
immunospecifically binds a polypeptide of the invention" refers to an antibody or a
fragment thereof that immunospecifically binds to the polypeptide encoded by the nucleic
acid sequence of SEQ ID NO:1, 11, 13, 15, 16, 240, 737, 1108, 1590, 1965, 2471, 2472,
2473, 2474, 2475 or 2476, or a complement, analog, derivative, or fragment thereof, or a
portion thereof, or that immunospecifically binds to the polypeptide having the amino
acid sequence of SEQ ID NO:2, 12, 14, 17-239, 241-736, 738-1107, 1109-1589,
1591-1964 or 1966-2470, or a variant, analog, derivative, or fragment thereof, and does not
non-specifically bind to other polypeptides. An antibody or a fragment thereof that
immunospecifically binds to the polypeptide of the invention may cross-react with other
antigens. Preferably, an antibody or a fragment thereof that immunospecifically binds to
a polypeptide of the invention does not cross-react with other antigens. An antibody or a
fragment thereof that immunospecifically binds to the polypeptide of the invention can
be identified by, for example, immunoassays or other techniques known to those skilled
in the art.

As used herein, the term "epitope" refers to a fragment of an hSARS virus,
polypeptide or protein having antigenic or immunogenic activity in an animal, preferably
a mammal, and most preferably in a human. An epitope having immunogenic activity is
a fragment of a polypeptide that elicits an antibody response in an animal. An epitope
having antigenic activity is a fragment of a polypeptide or protein to which an antibody
immunospecifically binds as determined by any method well known in the art, for
example, by the immunoassays described herein. Antigenic epitopes need not necessarily
be immunogenic.

As used herein, the term "antigenicity" refers to the ability of a substance (e.g.,
foreign objects, microorganisms, drugs, antigens, proteins, peptides, polypeptides,
nucleic acids, DNA, RNA, etc.) to trigger an immune response in a particular organism,
tissue, and/or cell. Sometimes, the term "antigenic" is synonymous with the term
"immunogenic".

As used herein, the term "immunogenicity" refers to the property of a substance
(e.g., foreign objects, microorganisms, drugs, antigens, proteins, peptides, polypeptides,
nucleic acids, DNA, RNA, etc.) being able to evoke an immune response within an
organism. Immunogenicity depends partly upon the size of the substance in question and
partly upon how unlike the host's own molecules the substance is. Highly conserved
proteins tend to have rather low immunogenicity.

An "isolated" nucleic acid molecule is one which is separated from other nucleic 
acid molecules which are present in the natural source of the nucleic acid molecule. 
Moreover, an "isolated" nucleic acid molecule, such as a cDNA molecule, can be
substantially free of other cellular material, or culture medium when produced by
recombinant techniques, or substantially free of chemical precursors or other chemicals
when chemically synthesized. In a preferred embodiment of the invention, nucleic acid
molecules encoding polypeptides/proteins of the invention are isolated or purified. The
term "isolated" nucleic acid molecule does not include a nucleic acid that is a member of
a library that has not been purified away from other library clones containing other
nucleic acid molecules.

As used herein, the term "hybridizes under stringent conditions" describes
conditions for hybridization and washing under which nucleotide sequences having at
least 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, or 95%
identity to each other typically remain hybridized to each other. Such hybridization
conditions are described in, for example but not limited to, Current Protocols in
Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6; Basic Methods in
Molecular Biology, Elsevier Science Publishing Co., Inc., N.Y. (1986), pp. 75-78 and
84-87; and Molecular Cloning, Cold Spring Harbor Laboratory, N.Y. (1982), pp. 387-389,
and are well known to those skilled in the art. A preferred, non-limiting example of
stringent hybridization conditions is hybridization in 6X sodium chloride/sodium citrate
(SSC), 0.5% SDS at about 68°C followed by one or more washes in 2X SSC, 0.5% SDS
at room temperature. Another preferred, non-limiting example of stringent hybridization
conditions is hybridization in 6X SSC at about 45°C followed by one or more washes in
0.2X SSC, 0.1% SDS at about 50°C to 65°C.

An "isolated" or "purified" peptide or protein is substantially free of cellular
material or other contaminating proteins from the cell or tissue source from which the
protein is derived, or is substantially free of chemical precursors or other chemicals when
chemically synthesized. The language "substantially free of cellular material" includes
preparations of a polypeptide/protein in which the polypeptide/protein is separated from
cellular components of the cells from which it is isolated or recombinantly produced.
Thus, a polypeptide/protein that is substantially free of cellular material includes
preparations of the polypeptide/protein having less than about 30%, 20%, 10%, 5%, 2.5%,
or 1% (by dry weight) of contaminating protein. When the polypeptide/protein is
recombinantly produced, it is also preferably substantially free of culture medium, i.e.,
culture medium represents less than about 20%, 10%, or 5% of the volume of the protein
preparation. When the polypeptide/protein is produced by chemical synthesis, it is
preferably substantially free of chemical precursors or other chemicals, i.e., it is separated
from chemical precursors or other chemicals which are involved in the synthesis of the
protein. Accordingly, such preparations of the polypeptide/protein have less than about
30%, 20%, 10%, 5% (by dry weight) of chemical precursors or compounds other than the
polypeptide/protein fragment of interest. In a preferred embodiment of the present
invention, the polypeptides/proteins are isolated or purified.

As used herein, an "isolated" virus is one which is separated from other
organisms which are present in the natural source of the virus, e.g., biological material
such as cells, blood, serum, plasma, saliva, urine, stool, sputum, nasopharyngeal aspirates,
and so forth. The isolated virus can be used to infect a subject.

As used herein, the term "having a biological activity of the polypeptides of the
invention" refers to the characteristics of the polypeptides or proteins having a common
biological activity, a similar or identical structural domain, and/or sufficient amino
acid identity to the polypeptide encoded by the nucleotide sequence of SEQ ID NO:1, 11,
13, 15, 16, 240, 737, 1108, 1590, 1965, 2471, 2472, 2473, 2474, 2475 or 2476, or a
complement, analog, derivative, or fragment thereof, or a portion thereof, or the
polypeptide having the amino acid sequence of SEQ ID NO:2, 12, 14, 17-239, 241-736,
738-1107, 1109-1589, 1591-1964 or 1966-2470, or a variant, analog, derivative, or
fragment thereof. Such common biological activities of the polypeptides of the invention
include antigenicity and immunogenicity.

As used herein, the term "portion" or "fragment" refers to a fragment of a nucleic
acid molecule containing at least about 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 150,
200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000,
1050, 1100, 1150, 1200, 2,000, 3,000, 4,000, 5,000, 6,000, 7,000, 8,000, 9,000, 10,000,
11,000, 12,000, 13,000, 14,000, 15,000, 16,000, 17,000, 18,000, 19,000, 20,000, 21,000,
22,000, 23,000, 24,000, 25,000, 26,000, 27,000, 28,000, 29,000 or more contiguous
nucleotides in length of the relevant nucleic acid molecule and having at least one
functional feature of the nucleic acid molecule (or the encoded protein has one functional
feature of the protein encoded by the nucleic acid molecule); or a fragment of a protein or
a polypeptide containing at least 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75,
80, 90, 100, 120, 140, 160, 180, 200, 220, 240, 260, 280, 300, 320, 340, 360, 380, 400,
500, 600, 800, 1,000, 2,000, 3,000, 4,000, 5,000, 6,000, 7,000, 8,000, 9,000, 9,500 or
more amino acid residues in length of the relevant protein or polypeptide and having at
least one functional feature of the protein or polypeptide.

As used herein, the term "analogue" in the context of a proteinaceous agent (e.g.,
proteins, polypeptides, peptides, and antibodies) refers to a proteinaceous agent that
possesses a similar or identical function as a second proteinaceous agent but does not
necessarily comprise a similar or identical amino acid sequence of the second
proteinaceous agent, or possess a similar or identical structure of the second
proteinaceous agent. In a specific embodiment, antibody analogues immunospecifically
bind to the same epitope as the original antibodies from which the analogues were
derived. In an alternative embodiment, antibody analogues immunospecifically bind to
different epitopes than the original antibodies from which the analogues were derived. A
proteinaceous agent that has a similar amino acid sequence refers to a second
proteinaceous agent that satisfies at least one of the following: (a) a proteinaceous agent
having an amino acid sequence that is at least 30%, at least 35%, at least 40%, at least
45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at
least 80%, at least 85%, at least 90%, at least 95% or at least 99% identical to the amino
acid sequence of a second proteinaceous agent; (b) a proteinaceous agent encoded by a
nucleotide sequence that hybridizes under stringent conditions to a nucleotide sequence
encoding a second proteinaceous agent of at least 5 contiguous amino acid residues, at
least 10 contiguous amino acid residues, at least 15 contiguous amino acid residues, at
least 20 contiguous amino acid residues, at least 25 contiguous amino acid residues, at
least 40 contiguous amino acid residues, at least 50 contiguous amino acid residues, at
least 60 contiguous amino acid residues, at least 70 contiguous amino acid residues, at
least 80 contiguous amino acid residues, at least 90 contiguous amino acid residues, at
least 100 contiguous amino acid residues, at least 125 contiguous amino acid residues, or
at least 150 contiguous amino acid residues; and (c) a proteinaceous agent encoded by a
nucleotide sequence that is at least 30%, at least 35%, at least 40%, at least 45%, at least
50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at
least 85%, at least 90%, at least 95% or at least 99% identical to the nucleotide sequence
encoding a second proteinaceous agent. A proteinaceous agent with similar structure to a
second proteinaceous agent refers to a proteinaceous agent that has a similar secondary,
tertiary or quaternary structure to the second proteinaceous agent. The structure of a
proteinaceous agent can be determined by methods known to those skilled in the art,
including but not limited to, peptide sequencing, X-ray crystallography, nuclear magnetic
resonance, circular dichroism, and crystallographic electron microscopy.

To determine the percent identity of two amino acid sequences or of two nucleic
acid sequences, the sequences are aligned for optimal comparison purposes (e.g., gaps
can be introduced in the sequence of a first amino acid or nucleic acid sequence for
optimal alignment with a second amino acid or nucleic acid sequence). The amino acid
residues or nucleotides at corresponding amino acid positions or nucleotide positions are
then compared. When a position in the first sequence is occupied by the same amino acid
residue or nucleotide as the corresponding position in the second sequence, then the
molecules are identical at that position. The percent identity between the two sequences
is a function of the number of identical positions shared by the sequences (i.e., % identity
= number of identical overlapping positions / total number of positions × 100%). In one
embodiment, the two sequences are the same length.
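
The calculation above can be sketched in a few lines of Python. This is an
illustration only: the function name is hypothetical, the two sequences are assumed to
be already aligned to the same length, and a gap character ("-") is assumed never to
count as an identical position.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    % identity = identical positions / total positions x 100
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    # Count aligned positions occupied by the same residue; gaps never match.
    identical = sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * identical / len(seq_a)


if __name__ == "__main__":
    # Toy nucleotide sequences differing at one of eight positions.
    print(percent_identity("ACGTACGT", "ACGTTCGT"))
```

Only exact matches are counted in this sketch; conservative substitutions are not
scored.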

The determination of percent identity between two sequences can also be
accomplished using a mathematical algorithm. A preferred, non-limiting example of a
mathematical algorithm utilized for the comparison of two sequences is the algorithm of
Karlin and Altschul, 1990, Proc. Natl. Acad. Sci. U.S.A. 87:2264-2268, modified as in
Karlin and Altschul, 1993, Proc. Natl. Acad. Sci. U.S.A. 90:5873-5877. Such an
algorithm is incorporated into the NBLAST and XBLAST programs of Altschul et al.,
1990, J. Mol. Biol. 215:403. BLAST nucleotide searches can be performed with the
NBLAST nucleotide program parameters set, e.g., to score=100, wordlength=12 to
obtain nucleotide sequences homologous to a nucleic acid molecule of the present
invention. BLAST protein searches can be performed with the XBLAST program
parameters set, e.g., to score=50, wordlength=3 to obtain amino acid sequences
homologous to a protein molecule of the present invention. To obtain gapped alignments
for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al.,
1997, Nucleic Acids Res. 25:3389-3402. Alternatively, PSI-BLAST can be used to
perform an iterated search which detects distant relationships between molecules (Id.).
When utilizing BLAST, Gapped BLAST, and PSI-BLAST programs, the default parameters
of the respective programs (e.g., of XBLAST and NBLAST) can be used (see, e.g., the
NCBI website). Another preferred, non-limiting example of a mathematical algorithm
utilized for the comparison of sequences is the algorithm of Myers and Miller, 1988,
CABIOS 4:11-17. Such an algorithm is incorporated in the ALIGN program (version 2.0)
which is part of the GCG sequence alignment software package. When utilizing the
ALIGN program for comparing amino acid sequences, a PAM120 weight residue table, a
gap length penalty of 12, and a gap penalty of 4 can be used.

The percent identity between two sequences can be determined using techniques 
similar to those described above, with or without allowing gaps. In calculating percent 
identity, typically only exact matches are counted. 

As used herein, the term "derivative" in the context of a proteinaceous agent (e.g.,
proteins, polypeptides, peptides, and antibodies) refers to a proteinaceous agent that
comprises an amino acid sequence which has been altered by the introduction of amino
acid residue substitutions, deletions, and/or additions. The term "derivative" as used
herein also refers to a proteinaceous agent which has been modified, i.e., by the covalent
attachment of any type of molecule to the proteinaceous agent. For example, but not by
way of limitation, an antibody may be modified, e.g., by glycosylation, acetylation,
pegylation, phosphorylation, amidation, derivatization by known protecting/blocking
groups, proteolytic cleavage, linkage to a cellular ligand or other protein, etc. A
derivative of a proteinaceous agent may be produced by chemical modifications using
techniques known to those of skill in the art, including, but not limited to specific
chemical cleavage, acetylation, formylation, metabolic synthesis of tunicamycin, etc.
Further, a derivative of a proteinaceous agent may contain one or more non-classical
amino acids. A derivative of a proteinaceous agent possesses a similar or identical
function as the proteinaceous agent from which it was derived.

As used herein, the terms "subject" and "patient" are used interchangeably. As
used herein, the terms "subject" and "subjects" refer to an animal, preferably a mammal
including a non-primate (e.g., cows, pigs, horses, goats, sheep, cats, dogs, avian species
and rodents) and a primate (e.g., monkeys such as a cynomolgus monkey and
humans), and more preferably a human.

4. DESCRIPTIONS OF THE FIGURES 

Figure 1 shows a partial DNA sequence (SEQ ID NO:1) and its deduced amino
acid sequence (SEQ ID NO:2) obtained from the SARS virus that has 57% homology to
the RNA-dependent RNA polymerase protein of known coronaviruses.

Figure 2 shows an electron micrograph of the novel hSARS virus, which has
morphological characteristics similar to those of coronaviruses.

Figure 3 shows an immunofluorescent staining for IgG antibodies that are bound
to the FrHK-4 cells infected with the novel human respiratory virus of Coronaviridae.

Figure 4 shows an electron micrograph of ultra-centrifuged deposit of hSARS
virus that was grown in the cell culture and negatively stained with 3% potassium
phospho-tungstate at pH 7.0.

Figure 5A shows a thin-section electron micrograph of lung biopsy of a patient
with SARS; Figure 5B shows a thin-section electron micrograph of hSARS virus-infected
cells.

Figure 6 shows the result of phylogenetic analysis for the partial protein sequence
(215 amino acids; SEQ ID NO:2) of the hSARS virus (GenBank accession number
AY268070). The phylogenetic tree is constructed by the neighbor-joining method. The
horizontal-line distance represents the number of sites at which the two sequences
compared are different. Bootstrap values are deduced from 500 replicates.

Figure 7A shows an amplification plot of fluorescence intensity against the PCR
cycle in a real-time quantitative PCR assay that can detect an hSARS virus in samples
quantitatively. The copy numbers of input plasmid DNA in the reactions are indicated.
The X-axis denotes the cycle number of a quantitative PCR assay and the Y-axis denotes
the fluorescence intensity (FI) over the background. Figure 7B shows the result of a
melting curve analysis of PCR products from clinical samples. Signals from positive
(+ve) samples, negative (-ve) samples and water control (water) are indicated. The
X-axis denotes the temperature (°C) and the Y-axis denotes the fluorescence intensity (FI)
over the background.

Figure 8 shows another partial DNA sequence (SEQ ID NO: 11) and its deduced
amino acid sequence (SEQ ID NO: 12) obtained from the hSARS virus.

Figure 9 shows yet another partial DNA sequence (SEQ ID NO: 13) and its 
deduced amino acid sequence (SEQ ID NO: 14) obtained from the hSARS virus. 

Figure 10 shows the entire genomic DNA sequence (SEQ ID NO: 15) of the
hSARS virus.

Figure 11 shows the deduced amino acid sequences obtained from SEQ ID NO: 15
in three frames (see SEQ ID NOS: 16, 240 and 737). An asterisk (*) indicates a stop
codon which marks the end of a peptide. The first-frame amino acid sequences: SEQ ID
NOS: 17-239; the second-frame amino acid sequences: SEQ ID NOS:241-736; and the
third-frame amino acid sequences: SEQ ID NOS:738-1107.
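
By way of non-limiting illustration, a three-frame conceptual translation of the kind summarized for Figures 11 and 12 can be sketched as follows. The sketch assumes the standard genetic code and a DNA-alphabet input; the function names are hypothetical, reading "complement" as the reverse complement is an assumption, and this is not necessarily the procedure used to generate the Sequence Listing.

```python
# Illustrative three-frame translation under the standard genetic code.
# All names here are hypothetical and chosen only for this sketch.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string (assumed reading of 'complement')."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate_frame(dna: str, frame: int) -> str:
    """Translate one reading frame (0, 1 or 2); '*' marks a stop codon."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    # Codons containing non-ACGT symbols are reported as 'X'.
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def three_frames(dna: str) -> list[str]:
    """Translate all three forward reading frames of a DNA string."""
    return [translate_frame(dna, frame) for frame in range(3)]
```

Splitting each translated frame at the asterisks would yield the individual peptides; applying `three_frames` to `reverse_complement(sequence)` would give the three frames of the opposite strand.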

Figure 12 shows the deduced amino acid sequences obtained from the
complement of SEQ ID NO: 15 in three frames (see SEQ ID NOS: 1108, 1590 and 1965).
An asterisk (*) indicates a stop codon which marks the end of a peptide. The first-frame
amino acid sequences: SEQ ID NOS: 1109-1589; the second-frame amino acid sequences:
SEQ ID NOS:1591-1964; and the third-frame amino acid sequences: SEQ ID NOS:1966-2470.

Figure 13 shows the nucleic acid sequence of the forward primers (SEQ ID
NOS:2471 and 2474), reverse primers (SEQ ID NOS:2472 and 2475), and hybridization
probes (SEQ ID NOS:2473 and 2476) for the quantitative TaqMan® assay for hSARS
virus detection.

Figure 14 shows the standard curve for the real-time quantitative RT-PCR assay
for SARS-CoV. The threshold cycle (Ct) is the number of PCR cycles required for the
fluorescent intensity of the reaction to reach a predefined threshold. The Ct is inversely
proportional to the logarithm of the starting concentration of plasmid DNA. The
correlation coefficient is indicated. Ct was calculated based on the calculated threshold
value in the standard amplification plot by maximum curvature approach for different
starting copy numbers. X-axis denotes log copy number of the standard and Y-axis
denotes Ct.
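
For illustration only, the relationship between Ct and the logarithm of the starting copy number can be modeled as a straight line, Ct = slope x log10(copies) + intercept, with a negative slope, and an unknown sample can then be read off the fitted line. The sketch below assumes this linear model; the standard values are hypothetical and are not data from Figure 14, and the function names are chosen only for this example.

```python
import math


def fit_standard_curve(copies, cts):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the fitted line to estimate the starting copy number of a sample."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical plasmid standards (copies per reaction) and their Ct values.
standard_copies = [1e1, 1e2, 1e3, 1e4, 1e5]
standard_cts = [36.7, 33.4, 30.1, 26.8, 23.5]

slope, intercept = fit_standard_curve(standard_copies, standard_cts)
estimated = copies_from_ct(30.1, slope, intercept)
```

A lower Ct corresponds to a higher starting copy number, which is why the fitted slope is negative.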

Figure 15 shows a representative amplification plot of fluorescence intensity
against the number of PCR cycles for the NPA specimens isolated from the SARS
patients, using the modified RT-PCR detection method of the present invention. With the
modified RNA extraction protocol, 40 out of 50 NPA samples were positive in the
real-time assay. Of those samples that were negative in the first generation RT-PCR assay, all
were found to contain very low amounts of viral RNA by the detection method of the
present invention. X-axis denotes the number of PCR cycles and Y-axis indicates the
fluorescence intensity over background signal (ΔRn).

Figure 16 is a graph showing the viral load of SARS-CoV in the clinical
specimens in relation to the days of onset. The result indicates that the viral load
increases as the disease progresses. Some of the samples that were positive in the first
generation assay were found to contain very high amounts of viral RNA. X-axis denotes
the days of onset and Y-axis denotes the copy numbers per reaction in the samples.

5. DETAILED DESCRIPTION OF THE INVENTION 

The present invention relates to the use of the sequence information of the
isolated hSARS virus for diagnostic methods. In particular, the present invention
provides a method for detecting the presence or absence of nucleic acid molecules of the
hSARS virus, natural or artificial variants, analogs, or derivatives thereof, in a biological
sample. The method involves obtaining a biological sample from various sources and
contacting the sample with a compound or an agent capable of detecting a nucleic acid
(e.g., mRNA, genomic DNA) of the hSARS virus, natural or artificial variants, analogs,
or derivatives thereof, such that the presence of the hSARS virus, natural or artificial
variants, analogs, or derivatives thereof, is detected in the sample. A preferred agent for
detecting hSARS mRNA or genomic RNA is a labeled nucleic acid probe capable of
hybridizing to mRNA or genomic RNA. In a preferred embodiment, the nucleic acid
probe is a nucleic acid molecule comprising or consisting of the nucleic acid sequence of
SEQ ID NO:2473 or 2476, or a portion thereof, which sufficiently specifically hybridizes
under stringent conditions to an hSARS mRNA or genomic RNA. In a preferred specific
embodiment, the presence of the hSARS virus, natural or artificial variants, analogs, or
derivatives thereof, is detected in the sample by a reverse transcription polymerase chain
reaction (RT-PCR) using the primers that are constructed based on a partial nucleotide
sequence of the hSARS virus. In a non-limiting specific embodiment, preferred primers
to be used in a RT-PCR method are: 5'-CAGAACGCTGTAGCTTCAAAAATCT-3'
(SEQ ID NO:2471) and 5'-TCAGAACCCTGTGATGAATCAACAG-3' (SEQ ID
NO:2472), in the presence of MgCl2, and the thermal cycles are, for example, but not
limited to, 50°C for 2 min, 95°C for 10 min, followed by 45 cycles of 95°C for
15 sec and 60°C for 1 min (also see Sections 6.7, 6.8, 6.9 infra). In preferred
embodiments, the primers comprise the nucleic acid sequence of SEQ ID NOS:2471 and
2472. In another non-limiting specific embodiment, preferred primers to be used in a
RT-PCR method are: 5'-ACCAGAATGGAGGACGCAATG-3' (SEQ ID NO:2474) and
5'-GCTGTGAACCAAGACGCAGTATTAT-3' (SEQ ID NO:2475), in the presence of
MgCl2, and the thermal cycles are, for example, but not limited to, 50°C for 2 min, 95°C
for 10 min, followed by 45 cycles of 95°C for 15 sec and 60°C for 1 min (also
see Sections 6.7, 6.8, 6.9 infra). In preferred embodiments, the primers comprise the
nucleic acid sequence of SEQ ID NOS:2474 and 2475.
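
As a non-limiting illustration, the exemplary thermal profile recited above can be written down as a small data structure and its programmed hold time totaled. The class and variable names below are hypothetical, and the total deliberately ignores temperature ramping between steps, which depends on the instrument.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One programmed hold: target temperature and duration."""
    temp_c: float
    seconds: int


# Initial holds recited in the text: 50 C for 2 min, then 95 C for 10 min.
INITIAL_HOLDS = [Step(50, 120), Step(95, 600)]
# Repeated two-step cycle: 95 C for 15 sec, then 60 C for 1 min.
CYCLE = [Step(95, 15), Step(60, 60)]
N_CYCLES = 45


def total_hold_seconds(holds, cycle, n_cycles):
    """Sum programmed hold times only; ramp times are not modeled."""
    return sum(s.seconds for s in holds) + n_cycles * sum(s.seconds for s in cycle)


total = total_hold_seconds(INITIAL_HOLDS, CYCLE, N_CYCLES)
print(f"Programmed hold time: {total} s ({total / 60:.2f} min)")
```

Under these assumptions the programmed holds come to 4095 seconds, or a little over 68 minutes, before any ramp time is added.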

The methods of the present invention can involve a real-time quantitative PCR
assay. In a preferred embodiment, the quantitative PCR used in the present invention is
a TaqMan® assay (Holland et al., Proc Natl Acad Sci USA 88(16):7276 (1991)). The
assays can be performed on an instrument designed to perform such assays, for example
those available from Applied Biosystems (Foster City, CA). In more preferred specific
embodiments, the present invention provides a real-time quantitative PCR assay to detect
the presence of the hSARS virus, natural or artificial variants, analogs, or derivatives
thereof, in a biological sample by subjecting the cDNA obtained by reverse transcription
of the extracted total RNA from the sample to PCR reactions using specific primers, and
detecting the amplified product using a probe. In preferred embodiments, the probe is a
TaqMan® probe which consists of an oligonucleotide with a 5'-reporter dye and a
3'-quencher dye. In a preferred embodiment, the probe has a nucleotide sequence of
5'-TCTGCGTAGGCAATCC-3' (SEQ ID NO:2473). In another preferred embodiment, the
probe has a nucleotide sequence of 5'-ACCCCAAGGTTTACCC-3' (SEQ ID NO:2476).
A fluorescent reporter dye, such as FAM® dye, is covalently linked to the 5' end of the
oligonucleotide probe. Other dyes, such as TET® dye or VIC® dye, may be used as reporter
dyes. Each of the reporters is quenched by a TAMRA® dye or a non-fluorescent
quencher at the 3' end. In a preferred embodiment, the 3' end is labeled with NFQ-MGB. The
fluorescence signals from these reactions are captured at the end of extension steps as
PCR product is generated over a range of the thermal cycles, thereby allowing the
quantitative determination of the viral load in the sample based on an amplification plot.
Other techniques for detection of RNA may be used. For example, in vitro
techniques for detection of mRNA include northern hybridizations, in situ hybridizations,
RT-PCR, and RNase protection. In vitro techniques for detection of genomic RNA
include northern hybridizations, RT-PCR, and RNase protection.

As discussed above, in a preferred embodiment, the polynucleotides of the
hSARS virus may be amplified before they are detected. The term "amplified" refers to
the process of making multiple copies of the nucleic acid from a single polynucleotide
molecule. The amplification of polynucleotides can be carried out in vitro by
biochemical processes known to those of skill in the art. The amplification agent may be
any compound or system that will function to accomplish the synthesis of primer
extension products, including enzymes. Suitable enzymes for this purpose include, for
example, E. coli DNA polymerase I, Taq polymerase, Klenow fragment of E. coli DNA
polymerase I, T4 DNA polymerase, other available DNA polymerases, polymerase
muteins, reverse transcriptase, ligase, and other enzymes, including heat-stable enzymes
(i.e., those enzymes that perform primer extension after being subjected to temperatures
sufficiently elevated to cause denaturation). Suitable enzymes will facilitate combination
of the nucleotides in the proper manner to form the primer extension products that are
complementary to each mutant nucleotide strand. In a preferred embodiment, the enzyme
is AmpliTaq Gold® DNA Polymerase from Applied Biosystems. Generally, the synthesis
will be initiated at the 3'-end of each primer and proceed in the 5'-direction along the
template strand, until synthesis terminates, producing molecules of different lengths.
There may be amplification agents, however, that initiate synthesis at the 5'-end and
proceed in the other direction, using the same process as described above. In any event,
the method of the invention is not to be limited to the embodiments of amplification
described herein.

One method of in vitro amplification, which can be used according to this
invention, is the polymerase chain reaction (PCR) described in U.S. Patent Nos.
4,683,202 and 4,683,195. The term "polymerase chain reaction" refers to a method for
amplifying a DNA base sequence using a heat-stable DNA polymerase and two
oligonucleotide primers, one complementary to the (+)-strand at one end of the sequence
to be amplified and the other complementary to the (-)-strand at the other end. Because
the newly synthesized DNA strands can subsequently serve as additional templates for
the same primer sequences, successive rounds of primer annealing, strand elongation, and
dissociation produce rapid and highly specific amplification of the desired sequence. The
polymerase chain reaction is used to detect the presence of polynucleotides encoding
cytokines in the sample. Many polymerase chain methods are known to those of skill in
the art and may be used in the method of the invention. For example, DNA can be
subjected to 30 to 35 cycles of amplification in a thermocycler as follows: 95°C for 30
sec, 52° to 60°C for 1 min, and 72°C for 1 min, with a final extension step of 72°C for 5
min. For another example, DNA can be subjected to 35 polymerase chain reaction cycles
in a thermocycler at a denaturing temperature of 95°C for 30 sec, followed by varying
annealing temperatures ranging from 54°C to 58°C for 1 min, an extension step at 70°C
for 1 min, with a final extension step at 70°C for 5 min.
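
The statement that successive rounds produce rapid amplification can be illustrated with the usual idealized growth model, in which each cycle multiplies the number of target copies by (1 + E), where E is the per-cycle efficiency (E = 1 meaning perfect doubling). This model and the function name below are assumptions made for illustration; real reactions plateau as reagents are consumed, which the sketch does not capture.

```python
def amplified_copies(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized PCR yield: initial_copies * (1 + efficiency) ** cycles.

    efficiency is the assumed fraction of templates copied per cycle
    (1.0 = perfect doubling). Plateau effects are ignored.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie between 0 and 1")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return initial_copies * (1.0 + efficiency) ** cycles


# One template molecule after 30 perfectly efficient cycles.
ideal_yield = amplified_copies(1, 30)
# The same reaction at an assumed 90% per-cycle efficiency.
reduced_yield = amplified_copies(1, 30, efficiency=0.9)
```

Even a modest drop in per-cycle efficiency compounds over 30 to 35 cycles, which is one reason quantitative work relies on a standard curve rather than on the ideal doubling figure.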

The primers for use in amplifying the mRNA or genomic RNA of the hSARS
virus may be prepared using any suitable method, such as conventional phosphotriester
and phosphodiester methods or automated embodiments thereof so long as the primers
are capable of hybridizing to the polynucleotides of interest. One method for
synthesizing oligonucleotides on a modified solid support is described in U.S. Patent No.
4,458,066. The exact length of primer will depend on many factors, including
temperature, buffer, and nucleotide composition. The primer must prime the synthesis of
extension products in the presence of the inducing agent for amplification.

Primers used according to the method of the invention are complementary to each
strand of nucleotide sequence to be amplified. The term "complementary" means that the
primers must hybridize with their respective strands under conditions that allow the
agent for polymerization to function. In other words, the primers that are complementary
to the flanking sequences hybridize with the flanking sequences and permit amplification
of the nucleotide sequence. Preferably, the 3' terminus of the primer that is extended has
perfectly base paired complementarity with the complementary flanking strand. Primers
and probes for polynucleotides of the hSARS virus can be developed using known
methods combined with the present disclosure. In preferred embodiments, the primers
are designed according to the TaqMan® primers protocol (Applied Biosystems). The
primers can be designed using Primer Express software as described in the Primer
Express User Bulletin (Applied Biosystems). Briefly, when designing primers, the primers
should be chosen after the probe. The primers are preferred to be as close as possible to the
probe without overlapping the probe. The G-C content of the primers should be in the
20% to 80% range. It is preferred to avoid runs of an identical nucleotide. This is
especially true for guanine, where runs of four or more Gs are preferred to be avoided.
The melting temperature of each primer is preferred to be 58°C to 60°C. The five
nucleotides at the 3' end of each primer are preferred not to have more than two G and/or
C bases.
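
By way of example only, the sequence-based primer preferences stated above (G-C content of 20% to 80%, no run of four or more Gs, and no more than two G and/or C bases among the five 3' nucleotides) can be checked mechanically. The function and label names below are hypothetical. The melting temperature preference is deliberately not checked, because the calculation used by Primer Express is not described here, and proximity to the probe is likewise outside the scope of this sketch.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def check_primer(seq: str) -> list[str]:
    """Return labels of the stated primer preferences that a sequence does not meet."""
    seq = seq.upper()
    problems = []
    if not 0.20 <= gc_fraction(seq) <= 0.80:
        problems.append("gc_content")       # G-C content outside 20% to 80%
    if "GGGG" in seq:
        problems.append("g_run")            # run of four or more Gs
    tail = seq[-5:]
    if tail.count("G") + tail.count("C") > 2:
        problems.append("three_prime_gc")   # more than two G/C in the last five bases
    return problems


# Primers of SEQ ID NOS:2471 and 2472 as recited in this description.
forward = "CAGAACGCTGTAGCTTCAAAAATCT"
reverse = "TCAGAACCCTGTGATGAATCAACAG"
for name, primer in (("forward", forward), ("reverse", reverse)):
    print(name, check_primer(primer) or "meets the checked preferences")
```

An empty list means only that the three checked preferences are met; it says nothing about melting temperature, specificity, or primer-dimer formation.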

Probes can be designed using Primer Express software as described in the Primer
Express User Bulletin (P/N 4317594) (Applied Biosystems). Briefly, it is preferred to
keep the G-C content in the 20% to 80% range. It is preferred to avoid runs of an
identical nucleotide. This is especially true for guanine, where runs of four or more Gs
should be avoided. It is preferred not to put a G base on the 5' end. It is preferred to
select the strand that gives the probe more Cs than Gs. It is preferred that both probes be
on the same strand. For single-probe assays, the melting temperature is preferred to be
68°C to 70°C.
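
Similarly, and again only as an illustrative sketch with hypothetical names, the sequence-based probe preferences stated above can be screened as follows. Melting temperature and strand choice relative to a second probe are not evaluated.

```python
def check_probe(seq: str) -> list[str]:
    """Return labels of the stated probe preferences that a sequence does not meet."""
    seq = seq.upper()
    problems = []
    gc = (seq.count("G") + seq.count("C")) / len(seq)
    if not 0.20 <= gc <= 0.80:
        problems.append("gc_content")         # G-C content outside 20% to 80%
    if "GGGG" in seq:
        problems.append("g_run")              # run of four or more Gs
    if seq.startswith("G"):
        problems.append("five_prime_g")       # G base on the 5' end
    if seq.count("C") <= seq.count("G"):
        problems.append("not_more_c_than_g")  # probe should carry more Cs than Gs
    return problems


# Probes of SEQ ID NOS:2473 and 2476 as recited in this description.
for probe in ("TCTGCGTAGGCAATCC", "ACCCCAAGGTTTACCC"):
    print(probe, check_probe(probe) or "meets the checked preferences")
```

When a candidate fails only the last check, the stated preference suggests evaluating the complementary strand instead, since that swaps the C and G counts.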

Those of ordinary skill in the art will know of various amplification
methodologies that can also be utilized to increase the copy number of target nucleic acid.
The polynucleotides detected in the method of the invention can be further evaluated,
detected, cloned, sequenced, and the like, either in solution or after binding to a solid
support, by any method usually applied to the detection of a specific nucleic acid
sequence such as another polymerase chain reaction, oligomer restriction (Saiki et al.,
Bio/Technology 3:1008-1012 (1985)), allele-specific oligonucleotide (ASO) probe
analysis (Conner et al., Proc. Natl. Acad. Sci. USA 80:278 (1983)), oligonucleotide
ligation assays (OLAs) (Landegren et al., Science 241:1077 (1988)), RNase Protection
Assay and the like. Molecular techniques for DNA analysis have been reviewed
(Landegren et al., Science 242:229-231 (1988)). Following DNA amplification, the
reaction product may be detected by Southern blot analysis, without using radioactive
probes. In such a process, for example, a small sample of DNA containing the
polynucleotides obtained from the tissue or subject is amplified, and analyzed via a
Southern blotting technique. The use of non-radioactive probes or labels is facilitated by
the high level of the amplified signal. In one embodiment of the invention, one
nucleoside triphosphate is radioactively labeled, thereby allowing direct visualization of
the amplification product by autoradiography. In another embodiment, amplification
primers are fluorescently labeled and run through an electrophoresis system.
Visualization of amplified products is by laser detection followed by computer assisted
graphic display, without a radioactive signal.

The size of the primers used to amplify a portion of the mRNA or genomic RNA
of the hSARS virus is at least 10, 15, 20, 25, or 30 nucleotides in length. Preferably, the
GC ratio should be above 30%, 35%, 40%, 45%, 50%, 55%, or 60% so as to prevent
hairpin structures in the primer. Furthermore, the amplicon should be sufficiently long
to be detected by standard molecular biology methodologies. Preferably, the
amplicon is at least 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 175, 200,
250, 300, 350, 400, 450, 500, 550, 600, 700, 800, or 1000 base pairs in length.

In a specific embodiment, the methods further involve obtaining a control sample
from a control subject, contacting the control sample with a compound or agent capable
of detecting the presence of mRNA or genomic RNA in the sample, and comparing the
presence of mRNA or genomic RNA in the control sample with the presence of mRNA
or genomic RNA in the test sample.

The invention also encompasses kits for detecting the presence of hSARS viral
nucleic acids in a test sample. The kit, for example, can comprise a labeled compound or
agent capable of detecting a nucleic acid molecule in a test sample and, in certain
embodiments, a means for determining the amount of mRNA in the sample (an
oligonucleotide probe which binds to DNA or mRNA).

For oligonucleotide-based kits, the kit can comprise, for example: (1) an
oligonucleotide, e.g., a detectably labeled oligonucleotide, which hybridizes to a nucleic
acid sequence of the hSARS virus and/or (2) a pair of primers useful for amplifying a
nucleic acid molecule containing the hSARS viral sequence. The kit can also comprise,
e.g., a buffering agent, a preservative, or a protein stabilizing agent. The kit can also
comprise components necessary for detecting the detectable agent (e.g., an enzyme or a
substrate). The kit can also contain a control sample or a series of control samples which
can be assayed and compared to the test sample. Each component of the kit is
usually enclosed within an individual container and all of the various containers are
usually enclosed within a single package along with instructions for use.

5.1. Nucleic Acid Sequences of hSARS Viruses 

The invention relates to the use of the sequence information of the isolated virus
for diagnostic and therapeutic methods. The entire genome sequence of the hSARS virus,
CCTCC-V200303, is disclosed in a United States Patent Application with Attorney
Docket No. V9661.0069 filed concurrently herewith on March 24, 2004, which is
incorporated by reference in its entirety. In a specific embodiment, the invention
provides the entire nucleotide sequence of the hSARS virus, CCTCC-V200303, SEQ ID
NO: 15, or a complement, analog, derivative, or fragment thereof, or a portion thereof.
Furthermore, the present invention relates to a nucleic acid molecule that hybridizes to
any portion of the genome of the hSARS virus, CCTCC-V200303, SEQ ID NO: 15, under
stringent conditions. In a specific embodiment, the invention provides nucleic acid
molecules which are suitable for use as primers consisting of or comprising the nucleic
acid sequence of SEQ ID NO:1, 3, 4, 11 or 13, or a complement, analog, derivative, or
fragment thereof, or a portion thereof. In preferred specific embodiments, the primers
comprise the nucleic acid sequence of SEQ ID NO:2471, 2472, 2474 or 2475. In another
specific embodiment, the invention provides nucleic acid molecules which are suitable
for use as hybridization probes for the detection of nucleic acids encoding a polypeptide
of the invention, consisting of or comprising the nucleic acid sequence of SEQ ID NO:1, 11,
13, 15, 16, 240, 737, 1108, 1590, 1965, 2471, 2472, 2473, 2474, 2475, or 2476 or a
complement, analog, derivative, or fragment thereof, or a portion thereof. In another
embodiment, the invention relates to a kit comprising primers having the nucleic acid
sequence of SEQ ID NOS:2471 and/or 2472 for the detection of the hSARS virus, natural
or artificial variants, analogs, or derivatives thereof. In a preferred embodiment, the kit
further contains a probe having the nucleic acid sequence of SEQ ID NO:2473. In
another embodiment, the invention relates to a kit comprising primers having the nucleic
acid sequence of SEQ ID NOS:2474 and/or 2475 for the detection of the hSARS virus,
natural or artificial variants, analogs, or derivatives thereof. In a preferred embodiment,
the kit further contains a probe having the nucleic acid sequence of SEQ ID NO:2476. In
another preferred embodiment, the kit further comprises reagents for the detection of
genes not found in the hSARS virus as a negative control. The invention further
encompasses chimeric or recombinant viruses or viral proteins encoded by said
nucleotide sequences.

The present invention also relates to the isolated nucleic acid molecules of the
hSARS virus, comprising, or, alternatively, consisting of the nucleic acid sequence of
SEQ ID NO: 1, 11, 13, 15, 16, 240, 737, 1108, 1590, 1965, 2471, 2472, 2473, 2474, 2475
or 2476, or a complement, analog, derivative, or fragment thereof, or a portion thereof.
In another specific embodiment, the invention provides isolated nucleic acid molecules
which hybridize under stringent conditions, as defined herein, to a nucleic acid molecule
having the nucleic acid sequence of SEQ ID NOS: 1, 11, 13, 15, 16, 240, 737, 1108, 1590,
1965, 2471, 2472, 2473, 2474, 2475 or 2476, or specific genes of known members of
Coronaviridae, or a complement, analog, derivative, or fragment thereof, or a portion
thereof. In another specific embodiment, the invention provides isolated polypeptides or
proteins that are encoded by a nucleic acid molecule comprising a nucleotide sequence
that is at least about 5, 10, 15, 20, 25, 30, 35, 40, 45, 100, 150, 200, 300, 350, 400, 450,
500, 550, 600, or more contiguous nucleotides of the nucleic acid sequence of SEQ ID
NO: 1, or a complement, analog, derivative, or fragment thereof. In another specific
embodiment, the invention provides isolated polypeptides or proteins that are encoded by
a nucleic acid molecule comprising a nucleotide sequence that is at least about 5, 10, 15,
20, 25, 30, 35, 40, 45, 100, 150, 200, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750,
800, 850, 900, 950, 1,000, 1,050, 1,100, 1,150, 1,200, or more contiguous nucleotides of
the nucleic acid sequence of SEQ ID NO: 11, or a complement, analog, derivative, or
fragment thereof. In yet another specific embodiment, the invention provides isolated
polypeptides or proteins that are encoded by a nucleic acid molecule comprising a
nucleotide sequence that is at least about 5, 10, 15, 20, 25, 30, 35, 40, 45, 100, 150, 200,
300, 350, 400, 450, 500, 550, 600, 650, 700, or more contiguous nucleotides of the
nucleic acid sequence of SEQ ID NO: 13, or a complement, analog, derivative, or
fragment thereof. In yet another specific embodiment, the invention provides isolated
polypeptides or proteins that are encoded by a nucleic acid molecule comprising or,
alternatively, consisting of a nucleotide sequence that is at least 5, 10, 15, 20, 25, 30, 35,
40, 45, 100, 150, 200, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900,
950, 1,000, 1,050, 1,100, 1,150, 1,200, 2,000, 3,000, 4,000, 5,000, 6,000, 7,000, 8,000,
9,000, 10,000, 11,000, 12,000, 13,000, 14,000, 15,000, 16,000, 17,000, 18,000, 19,000,
20,000, 21,000, 22,000, 23,000, 24,000, 25,000, 26,000, 27,000, 28,000, 29,000 or more
contiguous nucleotides of the nucleic acid sequence of SEQ ID NO: 15, or a complement,
analog, derivative, or fragment thereof. The polypeptides include those shown in Figures
11 (SEQ ID NOS: 17-239, 241-736, and 738-1107) and 12 (SEQ ID NOS:1109-1589,
1591-1964, and 1966-2470). The polypeptides or the proteins of the present invention
preferably have one or more biological activities of the proteins encoded by the nucleic
acid sequence of SEQ ID NO:1, 11, 13, 15, 16, 240, 737, 1108, 1590, 1965, 2471, 2472,
2473, 2474, 2475 or 2476, or the native viral proteins containing the amino acid
sequences encoded by the nucleic acid sequence of SEQ ID NO: 1, 11, 13, 15, 16, 240,
737, 1108, 1590, 1965, 2471, 2472, 2473, 2474, 2475 or 2476.

The invention further provides antibodies that specifically bind a polypeptide of
the invention encoded by the nucleic acid sequence of SEQ ID NO: 1, 11, 13, 16, 240,
737, 1108, 1590, 1965, 2471, 2472, 2473, 2474, 2475 or 2476, or a fragment thereof, or
any hSARS epitope. The invention further provides antibodies that specifically bind the
polypeptides of the invention encoded by the nucleic acid sequence of SEQ ID NO: 15, or
a fragment thereof, or any hSARS epitope. Such antibodies include, but are not limited
to, polyclonal, monoclonal, bi-specific, multi-specific, human, humanized, chimeric
antibodies, single chain antibodies, Fab fragments, F(ab')2 fragments, disulfide-linked
Fvs, intrabodies and fragments containing either a VL or VH domain or even a
complementarity determining region (CDR) that specifically binds to a polypeptide of the
invention.

In another embodiment, the invention provides vaccine preparations comprising
the hSARS virus, natural or artificial variants, analogs, or derivatives thereof. In yet
another embodiment, the invention provides vaccine preparations comprising
recombinant and chimeric forms of the hSARS virus, or subunits of the virus. In a
specific embodiment, the vaccine preparations comprise live but attenuated hSARS virus
with or without pharmaceutically acceptable excipients, including adjuvants. In another
specific embodiment, the vaccine preparations comprise an inactivated or killed hSARS
virus with or without pharmaceutically acceptable excipients, including adjuvants. The
vaccine preparations of the present invention may further comprise adjuvants.
Accordingly, the present invention further provides methods of preparing recombinant or
chimeric forms of the hSARS virus. In another specific embodiment, the vaccine
preparations of the present invention comprise one or more nucleic acid molecules
comprising or consisting of the nucleic acid sequence of SEQ ID NO: 1, 11, 13, 15, 16,
240, 737, 1108, 1590, 1965, 2471, 2472, 2473, 2474, 2475 or 2476, or a fragment thereof.
In another embodiment, the invention provides vaccine preparations comprising one or
more polypeptides of the invention encoded by a nucleotide sequence comprising or
consisting of the nucleic acid sequence of SEQ ID NO: 1, 11, 13, 16, 240, 737, 1108,
1590, 1965, 2471, 2472, 2473, 2474, 2475 or 2476, or a fragment thereof. In another
embodiment, the invention provides vaccine preparations comprising one or more
polypeptides of the invention encoded by a nucleotide sequence comprising or consisting
of the nucleic acid sequence of SEQ ID NO: 15, or a fragment thereof. Further, the
present invention provides methods for treating, ameliorating, managing, or preventing
SARS by administering the vaccine preparations or antibodies of the present invention
alone or in combination with antivirals (e.g., amantadine, rimantadine, gancyclovir,
acyclovir, ribavirin, penciclovir, oseltamivir, foscarnet, zidovudine (AZT), didanosine
(ddI), lamivudine (3TC), zalcitabine (ddC), stavudine (d4T), nevirapine, delavirdine,
indinavir, ritonavir, vidarabine, nelfinavir, saquinavir, relenza, tamiflu, pleconaril,
interferons, etc.), steroids and corticosteroids such as prednisone, cortisone, fluticasone
and glucocorticoid, antibiotics, analgesics, bronchodilators, or other treatments for
respiratory and/or viral infections.

Furthermore, the present invention provides pharmaceutical compositions
comprising anti-viral agents of the present invention and a pharmaceutically acceptable
carrier. The present invention also provides kits comprising pharmaceutical
compositions of the present invention.

In another aspect, the present invention provides methods for screening anti-viral
agents that inhibit the infectivity or replication of the hSARS virus, natural or artificial
variants, analogs, or derivatives thereof.

In one embodiment, the invention provides methods for detecting the presence,
activity or expression of the hSARS virus, natural or artificial variants, analogs, or
derivatives thereof, of the invention in a biological material, such as cells, blood, serum,
plasma, saliva, urine, stool, sputum, nasopharyngeal aspirates, and so forth. The presence
of the hSARS virus, natural or artificial variants, analogs, or derivatives thereof, in a
sample can be determined by contacting the biological material with an agent which can
detect directly or indirectly the presence of the hSARS virus, natural or artificial variants,
analogs, or derivatives thereof. In a specific embodiment, the detection agents are the
antibodies of the present invention. In another embodiment, the detection agent is a
nucleic acid of the present invention.

5.2. hSARS Viruses

5.2.1. Natural variants of hSARS viruses 

The present invention is based upon the inventor's isolation and identification of a
novel virus from subjects suffering from SARS. The isolated hSARS virus is that which
was deposited with the China Center for Type Culture Collection (CCTCC) on April 2,
2003 and accorded an accession number, CCTCC-V200303. The invention also relates
to natural variants of the hSARS virus of deposit accession no. CCTCC-V200303.

25 A natural variant of hSARS virus has a sequence that is different from the 

genomic sequence of the hS ARS virus due to one or more naturally occurred mutations, 
including, but not limited to, point mutations, rearrangements, insertions, deletions, etc., 
to the genomic sequence that may or may not result in a phenotypic change. Preferably, 
the variants include less than 25, 20, 15, 10, 5, 4, 3, or 2 amino acid substitutions, 

30 rearrangements, insertions, and/or deletions relative to the hSARS virus. 
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Either conservative or non-conservative amino acid substitutions can be made at 
one or more amino acid residues. In preferred embodiments, the variants have 
conservative amino acid substitutions that are made at one or more predicted non- 
essential amino acid residues (i.e., amino acid residues which are not critical for the 
5 expression of the biological activities of the virus, e.g., infectivity, replication ability, 

protein synthesis ability, assembling ability, and cytotoxic effect). In other embodiments, 
the variants have non-conservative amino acid substitutions that are made at one or more 
predicted non-essential amino acid residues (i.e., amino acid residues which are not 
critical for the expression of the biological activities of the virus, e.g., infectivity, 

10 replication ability, protein synthesis ability, assembling ability, and cj^totoxic effect). 

A "conservative amino acid substitution" is one in which the amino acid residue 
is replaced with an amino acid residue having a side chain with a similar charge. A
"non-conservative amino acid substitution" is one in which the amino acid residue is replaced
with an amino acid residue having a side chain with an opposite charge. Families of
amino acid residues having side chains with similar charges have been defined in the art.
Genetically encoded amino acids can be divided into four families: (1) acidic =
aspartate, glutamate; (2) basic = lysine, arginine, histidine; (3) nonpolar = alanine, valine,
leucine, isoleucine, proline, phenylalanine, methionine, tryptophan; and (4) uncharged
polar = glycine, asparagine, glutamine, cysteine, serine, threonine, tyrosine. In similar
fashion, the amino acid repertoire can be grouped as (1) acidic = aspartate, glutamate; (2)
basic = lysine, arginine, histidine; (3) aliphatic = glycine, alanine, valine, leucine,
isoleucine, serine, threonine, with serine and threonine optionally grouped separately
as aliphatic-hydroxyl; (4) aromatic = phenylalanine, tyrosine, tryptophan; (5) amide =
asparagine, glutamine; and (6) sulfur-containing = cysteine and methionine. (See, for
example, Biochemistry, 4th ed., Ed. by L. Stryer, WH Freeman and Co.: 1995).
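
By way of illustration only, the four-family grouping recited above can be expressed as a simple lookup. The following Python sketch is not part of the disclosure; it assumes, as a simplification, that a substitution is counted as conservative when both residues belong to the same family, and the names FAMILIES, family and is_conservative, as well as the use of one-letter codes, are hypothetical choices made for the example.

```python
# Four-family grouping of the genetically encoded amino acids, as recited above.
# One-letter codes are used as an illustrative convention.
FAMILIES = {
    "acidic": "DE",                # aspartate, glutamate
    "basic": "KRH",                # lysine, arginine, histidine
    "nonpolar": "AVLIPFMW",        # alanine ... tryptophan
    "uncharged polar": "GNQCSTY",  # glycine ... tyrosine
}

# Invert the table: residue -> family name.
FAMILY_OF = {aa: name for name, members in FAMILIES.items() for aa in members}


def family(residue: str) -> str:
    """Return the family of a one-letter amino acid code."""
    try:
        return FAMILY_OF[residue.upper()]
    except KeyError:
        raise ValueError(f"unknown amino acid code: {residue!r}")


def is_conservative(original: str, replacement: str) -> bool:
    """Assumption: a same-family replacement is counted as conservative."""
    return family(original) == family(replacement)


if __name__ == "__main__":
    print(is_conservative("D", "E"))  # aspartate -> glutamate (both acidic)
    print(is_conservative("K", "E"))  # lysine -> glutamate (basic vs. acidic)
```

The six-family grouping could be substituted by replacing the FAMILIES table; the same-family criterion would remain an assumption of the sketch.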

The invention further relates to mutant hSARS virus. In one embodiment,
mutations can be introduced randomly along all or part of the coding sequence of the
hSARS virus or variants thereof, such as by saturation mutagenesis, and the resultant
mutants can be screened for biological activity to identify mutants that retain activity.
Techniques for mutagenesis known in the art can also be used, including but not limited
to, point-directed mutagenesis, chemical mutagenesis, in vitro site-directed mutagenesis,
using, for example, the QuikChange Site-Directed Mutagenesis Kit (Stratagene), etc.

Non-limiting examples of such modifications include substitutions of amino acids to
cysteines toward the formation of disulfide bonds; substitution of amino acids to tyrosine
and subsequent chemical treatment of the polypeptide toward the formation of dityrosine
bonds, as disclosed in detail herein; one or more amino acid substitutions and/or
biological or chemical modification toward generating a binding pocket for a small
molecule (substrate or inhibitor), and/or the introduction of side-chain specific tags (e.g.,
to characterize molecular interactions or to capture protein-protein interaction partners).
In a specific embodiment, the biological modification comprises alkylation,
phosphorylation, sulfation, oxidation or reduction, ADP-ribosylation, hydroxylation,
glycosylation, glucosylphosphatidylinositol addition, or ubiquitination. In another specific
embodiment, the chemical modification comprises altering the charge of the recombinant
virus. In yet another embodiment, a positive or negative charge is chemically added to an
amino acid residue where a charged amino acid residue is modified to an uncharged
residue.

5.2.2. Recombinant and chimeric hSARS viruses

The present invention also encompasses recombinant or chimeric viruses encoded
by viral vectors derived from the genome of hSARS virus or natural variants thereof. In
a specific embodiment, a recombinant virus is one derived from the hSARS virus of
deposit accession no. CCTCC-V200303. In a specific embodiment, the virus has a
nucleic acid sequence of SEQ ID NO: 15. In another specific embodiment, a recombinant
virus is one derived from a natural variant of hSARS virus. A natural variant of hSARS
virus has a sequence that is different from the genomic sequence (SEQ ID NO: 15) of the
hSARS virus, CCTCC-V200303, due to one or more naturally occurring mutations,
including, but not limited to, point mutations, rearrangements, insertions, deletions,
substitutions, etc., to the genomic sequence that may or may not result in a phenotypic
change. In accordance with the present invention, a viral vector which is derived from
the genome of the hSARS virus, CCTCC-V200303, is one that contains a nucleic acid
sequence that encodes at least a part of one ORF of the hSARS virus. In a specific
embodiment, the ORF comprises or consists of the nucleic acid sequence of SEQ ID NO:
1, 11, or 13, or a fragment thereof. In a specific embodiment, there is more than one
ORF within the nucleic acid sequence of SEQ ID NO: 15, as shown in Figures 11 (see
SEQ ID NOS: 16, 240 and 737) and 12 (see SEQ ID NOS: 1108, 1590 and 1965), or a
fragment thereof. In another embodiment, the polypeptide encoded by the ORF
comprises or consists of the amino acid sequence of SEQ ID NO: 2, 12 or 14 or a
fragment thereof, or shown in Figures 11 (SEQ ID NO: 17-239, 241-736 or 738-1107)
and 12 (SEQ ID NO: 1109-1589, 1591-1964 or 1966-2470), or a fragment thereof. In
accordance with the present invention these viral vectors may or may not include nucleic
acids that are non-native to the viral genome.

In another specific embodiment, a chimeric virus of the invention is a
recombinant hSARS virus which further comprises a heterologous nucleotide sequence.
In accordance with the invention, a chimeric virus may be encoded by a nucleotide
sequence in which heterologous nucleotide sequences have been added to the genome or
in which endogenous or native nucleotide sequences have been replaced with
heterologous nucleotide sequences.

According to the present invention, the chimeric viruses are encoded by the viral
vectors of the invention which further comprise a heterologous nucleotide sequence. In
accordance with the present invention a chimeric virus is encoded by a viral vector that
may or may not include nucleic acids that are non-native to the viral genome. In
accordance with the invention a chimeric virus is encoded by a viral vector to which
heterologous nucleotide sequences have been added, inserted or substituted for native or
non-native sequences. In accordance with the present invention, the chimeric virus may
be encoded by nucleotide sequences derived from different strains or variants of hSARS
virus. In particular, the chimeric virus is encoded by nucleotide sequences that encode
antigenic polypeptides derived from different strains or variants of hSARS virus.

A chimeric virus may be of particular use for the generation of recombinant
vaccines protecting against two or more viruses (Tao et al., J. Virol. 72:2955-2961;
Durbin et al., 2000, J. Virol. 74:6821-6831; Skiadopoulos et al., 1998, J. Virol.
72:1762-1768; Teng et al., 2000, J. Virol. 74:9317-9321). For example, it can be envisaged that a
virus vector derived from the hSARS virus expressing one or more proteins of variants of
hSARS virus, or vice versa, will protect a subject vaccinated with such vector against
infections by both the native hSARS virus and the variant. Attenuated and
replication-defective viruses may be of use for vaccination purposes with live vaccines as has been
suggested for other viruses. (See PCT WO 02/057302, at pp. 6 and 23, incorporated by
reference herein).

In accordance with the present invention the heterologous sequences to be
incorporated into the viral vectors encoding the recombinant or chimeric viruses of the
invention include sequences obtained or derived from different strains or variants of the
hSARS virus.

In certain embodiments, the chimeric or recombinant viruses of the invention are
encoded by viral vectors derived from viral genomes wherein one or more sequences,
intergenic regions, termini sequences, or portions of or entire ORFs have been substituted
with a heterologous or non-native sequence. In certain embodiments of the invention, the
chimeric viruses of the invention are encoded by viral vectors derived from viral
genomes wherein one or more heterologous sequences have been inserted or added to the
vector.

The selection of the viral vector may depend on the species of the subject that is
to be treated or protected from a viral infection. If the subject is human, then an
attenuated hSARS virus can be used to provide the antigenic sequences.

In accordance with the present invention, the viral vectors can be engineered to
provide antigenic sequences which confer protection against infection by the hSARS
virus, natural or artificial variants, analogs, or derivatives thereof. The viral vectors may
be engineered to provide one, two, three or more antigenic sequences. In accordance
with the present invention the antigenic sequences may be derived from the same virus,
from different strains or variants of the same type of virus, or from different viruses.

The expression products and/or recombinant or chimeric virions obtained in
accordance with the invention may advantageously be utilized in vaccine formulations.
The expression products and chimeric virions of the present invention may be engineered
to create vaccines against a broad range of pathogens, including viral and bacterial
antigens, tumor antigens, allergen antigens, and auto antigens involved in autoimmune
disorders. In particular, the chimeric virions of the present invention may be engineered
to create vaccines for the protection of a subject from infections with the hSARS virus,
natural or artificial variants, analogs, or derivatives thereof.

In certain embodiments, the expression products and recombinant or chimeric
virions of the present invention may be engineered to create vaccines against a broad
range of pathogens, including viral antigens, tumor antigens and auto antigens involved
in autoimmune disorders. One way to achieve this goal involves modifying existing
hSARS genes to contain foreign sequences in their respective external domains. Where
the heterologous sequences are epitopes or antigens of pathogens, these chimeric viruses
may be used to induce a protective immune response against the disease agent from
which these determinants are derived.

Thus, the present invention relates to the use of viral vectors and recombinant or
chimeric viruses to formulate vaccines against a broad range of viruses and/or antigens.
The present invention also encompasses recombinant viruses comprising a viral vector
derived from the hSARS virus, natural or artificial variants, analogs, or derivatives
thereof, which contains sequences which result in a virus having a phenotype more
suitable for use in vaccine formulations, e.g., attenuated phenotype or enhanced
antigenicity. The mutations and modifications can be in coding regions, in intergenic
regions and in the leader and trailer sequences of the virus.

The invention provides a host cell comprising a nucleic acid or a vector according
to the invention. Plasmid or viral vectors containing the polymerase components of the
hSARS virus are generated in prokaryotic cells for the expression of the components in
relevant cell types (bacteria, insect cells, eukaryotic cells). Plasmid or viral vectors
containing full-length or partial copies of the hSARS genome will be generated in
prokaryotic cells for the expression of viral nucleic acids in vitro or in vivo. The latter
vectors may contain other viral sequences for the generation of chimeric viruses or
chimeric virus proteins, may lack parts of the viral genome for the generation of
replication defective virus, and may contain mutations, deletions, substitutions, or
insertions for the generation of attenuated viruses.

The present invention also provides a host cell comprising a nucleic acid molecule
of the present invention. In addition, the present invention provides a host cell infected
with the hSARS virus, for example, of deposit no. CCTCC-V200303, or the natural or
artificial variants, analogs, or derivatives thereof. In a specific embodiment, the
invention encompasses a continuous cell line infected with the hSARS virus. Preferably,
the cell line is a primate cell line. These cell lines may be cultured and maintained using
known cell culture techniques such as described in Celis, Julio, ed., 1994, Cell Biology
Laboratory Handbook, Academic Press, N.Y. Various culturing conditions for these cells,
including media formulations with regard to specific nutrients, oxygen tension, carbon
dioxide and reduced serum levels, can be selected and optimized by one of skill in the art.

The preferred cell line of the present invention is a eukaryotic cell line, preferably
a primate cell line, more preferably a monkey cell line, most preferably a fetal rhesus
monkey kidney cell line (e.g., FRhK-4), transiently or stably expressing one or more
full-length or partial hSARS proteins. Such cells can be made by transfection (proteins or
nucleic acid vectors), infection (viral vectors) or transduction (viral vectors) and may be
useful for complementation of mentioned wild-type, attenuated, replication-defective or
chimeric viruses. The cell lines for use in the present invention can be cloned using
known cell culture techniques familiar to one skilled in the art. The cells can be cultured
and expanded from a single cell using commercially available culture media under known
conditions suitable for propagating cells.

For example, the cell lines of the present invention, kept frozen until use, can be
warmed at a temperature of about 37°C and then added to a suitable growth medium such
as DMEM/F-12 (Life Technologies, Inc.) containing 3% fetal bovine serum (FBS). The
cells can be incubated at a temperature of about 37°C in a humidified incubator with
about 5% CO2 until confluent. In order to passage the cells, the growth medium can be
removed and 0.05% trypsin and 0.53 mM EDTA added to the cells. The cells will detach and
the cell suspension can be collected into centrifuge tubes and centrifuged into cell pellets.
The trypsin solution can be removed and the cell pellet resuspended into new growth
medium. The cells can then be further propagated in additional growth vessels to a
desired density.

In accordance with the present invention, a continuous cell line encompasses
immortalized cells which can be maintained in vitro for at least 5, 10, 15, 20, 25, or 50
passages.

Infectious copies of hSARS virus (being wild type, attenuated, replication- 
defective or chimeric) can be produced upon co-expression of the polymerase 
components according to the state-of-the-art technologies described above. 

In addition, eukaryotic cells, transiently or stably expressing one or more
full-length or partial hSARS proteins can be used. Such cells can be made by transfection
(proteins or nucleic acid vectors), infection (viral vectors) or transduction (viral vectors)
and may be useful for complementation of mentioned wild-type, attenuated,
replication-defective or chimeric viruses.

The viral vectors and chimeric viruses of the present invention may be used to
modulate a subject's immune system by stimulating a humoral immune response, a
cellular immune response or by stimulating tolerance to an antigen. As used herein, a
subject means: humans, primates, horses, cows, sheep, pigs, goats, dogs, cats, avian
species and rodents.

5.3. Vaccines and Antivirals

In a preferred embodiment, the invention provides a proteinaceous molecule or
hSARS virus specific viral protein or functional fragment thereof encoded by a nucleic
acid according to the invention. Useful proteinaceous molecules are for example derived
from any of the genes or genomic fragments derivable from the virus according to the
invention, including envelope protein (E protein), integral membrane protein (M protein),
spike protein (S protein), nucleocapsid protein (N protein), hemagglutinin esterase (HE
protein), and RNA-dependent RNA polymerase. Such molecules, or antigenic fragments
thereof, as provided herein, are for example useful in diagnostic methods or kits and in
pharmaceutical compositions such as subunit vaccines. Particularly useful are
polypeptides encoded by the nucleic acid sequence of SEQ ID NO: 1, 11, 13, 15, 2471,
2472, 2473, 2474, 2475 or 2476, or as shown in Figures 11 (SEQ ID NO: 17-239,
241-736 or 738-1107) and 12 (SEQ ID NO: 1109-1589, 1591-1964, 1966-2470), or antigenic
fragments thereof for inclusion as antigen or subunit immunogen, but inactivated whole
virus can also be used. Particularly useful are also those proteinaceous substances that
are encoded by recombinant nucleic acid fragments of the hSARS genome, more
preferred are those that are within the preferred bounds and metes of ORFs, in particular,
for eliciting hSARS specific antibody or T cell responses, whether in vivo (e.g., for
protective or therapeutic purposes or for providing diagnostic antibodies) or in vitro (e.g.,
by phage display technology or another technique useful for generating synthetic
antibodies).

5.3.1. Attenuation of hSARS viruses and variants thereof

The hSARS virus or variants thereof of the invention can be genetically
engineered to exhibit an attenuated phenotype. In particular, the viruses of the invention
exhibit an attenuated phenotype in a subject to which the virus is administered as a
vaccine. Attenuation can be achieved by any method known to a skilled artisan. Without
being bound by theory, the attenuated phenotype of the viruses of the invention can be
caused, e.g., by using a virus that naturally does not replicate well in an intended host
species, for example, by reduced replication of the viral genome, by reduced ability of the
virus to infect a host cell, or by reduced ability of the viral proteins to assemble to an
infectious viral particle relative to the wild-type strain of the virus.

In one embodiment, the infectivity of the virus is reduced by 10,000-fold,
9,000-fold, 8,000-fold, 7,000-fold, 6,000-fold, 5,000-fold, 4,000-fold, 3,000-fold, 2,500-fold,
2,000-fold, 1,500-fold, 1,250-fold, 1,000-fold, 900-fold, 800-fold, 700-fold, 600-fold,
500-fold, 400-fold, 300-fold, 200-fold, 100-fold, 50-fold, 25-fold, 10-fold, 5-fold, 1-fold,
or 90%, 80%, 70%, 60%, 50%, 40%, 30%, 20%, or 10%. As used herein, the term
"infectivity" refers to the ability of the virus to enter, survive, and multiply in a
susceptible host. In a specific embodiment, the infectivity of the hSARS virus is said to
be attenuated or reduced when grown in a human host if the growth of the hSARS virus
or variant thereof in the human host is reduced compared to the non-attenuated hSARS
virus or variant thereof. The infectivity of the virus can be measured using a variety of
methods such as, but not limited to, Western blot (proteins), Southern blot (DNA),
Northern blot (RNA), plaque formation assay, colorimetric, microscopic, and
chemiluminescence techniques. The infectivity of the virus can be measured in an animal
cell, preferably a primate cell, more preferably a monkey cell, most preferably a human
cell.
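
The fold-reduction and percentage values recited above describe the same quantity on two scales. As an illustrative sketch only, and assuming the convention (not defined in the text) that an n-fold reduction leaves 1/n of the original value, the two scales can be interconverted as follows; the function names are hypothetical.

```python
def fold_to_percent(fold: float) -> float:
    """Percentage reduction for an n-fold reduction (value falls to 1/n)."""
    if fold < 1:
        raise ValueError("fold reduction must be >= 1")
    return (1.0 - 1.0 / fold) * 100.0


def percent_to_fold(percent: float) -> float:
    """Fold reduction corresponding to a percentage reduction in [0, 100)."""
    if not 0 <= percent < 100:
        raise ValueError("percent reduction must be in [0, 100)")
    return 1.0 / (1.0 - percent / 100.0)


def observed_fold_reduction(reference: float, attenuated: float) -> float:
    """Fold reduction of a measurement (e.g., a titer) relative to a reference."""
    if reference <= 0 or attenuated <= 0:
        raise ValueError("measurements must be positive")
    return reference / attenuated
```

Under this assumed convention a 10-fold reduction corresponds to a 90% reduction and a 50% reduction to a 2-fold reduction.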

In another embodiment, the replication ability of the virus is reduced by
10,000-fold, 9,000-fold, 8,000-fold, 7,000-fold, 6,000-fold, 5,000-fold, 4,000-fold, 3,000-fold,
2,500-fold, 2,000-fold, 1,500-fold, 1,250-fold, 1,000-fold, 900-fold, 800-fold, 700-fold,
600-fold, 500-fold, 400-fold, 300-fold, 200-fold, 100-fold, 50-fold, 25-fold, 10-fold,
5-fold, 1-fold, or 90%, 80%, 70%, 60%, 50%, 40%, 30%, 20%, or 10%. As used herein,
the term "replication ability" refers to the ability of the virus to duplicate, multiply,
and/or reproduce. The replication ability can be determined using the doubling time, the
rate of replication, the growth rate, and/or the half-life of the virus. In a specific
embodiment, the replication ability of the hSARS virus is said to be attenuated or reduced
when grown in a human host if the growth of the hSARS virus or variant thereof in the
human host is reduced compared to the non-attenuated hSARS virus or variant thereof.
The replication ability of the virus can be measured using a variety of methods such as,
but not limited to, Western blot (proteins), Southern blot (DNA), Northern blot (RNA),
plaque formation assay, colorimetric, microscopic, and chemiluminescence
techniques. In some cases, replication and transcription may be synonymous. The
replication ability of the virus can be measured in an animal cell, preferably a primate cell,
more preferably a monkey cell, most preferably a human cell.

In another embodiment, the protein synthesis ability of the virus is reduced by
10,000-fold, 9,000-fold, 8,000-fold, 7,000-fold, 6,000-fold, 5,000-fold, 4,000-fold,
3,000-fold, 2,500-fold, 2,000-fold, 1,500-fold, 1,250-fold, 1,000-fold, 900-fold, 800-fold,
700-fold, 600-fold, 500-fold, 400-fold, 300-fold, 200-fold, 100-fold, 50-fold, 25-fold, 10-fold,
5-fold, 1-fold, or 90%, 80%, 70%, 60%, 50%, 40%, 30%, 20%, or 10%. As used herein,
the term "protein synthesis ability" refers to the ability of the virus to synthesize proteins
such as, but not limited to, envelope protein (E protein), integral membrane protein (M
protein), spike protein (S protein), nucleocapsid protein (N protein), hemagglutinin
esterase (HE protein), and RNA-dependent RNA polymerase. The protein synthesis
ability can be determined by the rate of protein synthesis (e.g., transcription level,
translation level), and the types and amount of protein synthesized by the virus. In a
specific embodiment, the protein synthesis ability of the hSARS virus is said to be
attenuated or reduced when grown in a human host if the growth of the hSARS virus or
variant thereof in the human host is reduced compared to the non-attenuated hSARS virus
or variant thereof. The protein synthesis ability of the virus can be measured using a
variety of methods such as, but not limited to, Western blot (proteins), Southern blot
(DNA), Northern blot (RNA), plaque formation assay, colorimetric, microscopic, and
chemiluminescence techniques. The protein synthesis ability of the virus can be
measured in an animal cell, preferably a primate cell, more preferably a monkey cell,
most preferably a human cell.

In another embodiment, the assembling ability of the virus is reduced by
10,000-fold, 9,000-fold, 8,000-fold, 7,000-fold, 6,000-fold, 5,000-fold, 4,000-fold, 3,000-fold,
2,500-fold, 2,000-fold, 1,500-fold, 1,250-fold, 1,000-fold, 900-fold, 800-fold, 700-fold,
600-fold, 500-fold, 400-fold, 300-fold, 200-fold, 100-fold, 50-fold, 25-fold, 10-fold,
5-fold, 1-fold, or 90%, 80%, 70%, 60%, 50%, 40%, 30%, 20%, or 10%. As used herein,
the term "assembling ability" refers to the ability of the virus to assemble the necessary
proteins or protein components into a viral particle. In a specific embodiment, the
assembling ability of the hSARS virus is said to be attenuated or reduced when grown in
a human host if the growth of the hSARS virus or variant thereof in the human host is
reduced compared to the non-attenuated hSARS virus or variant thereof. The assembling
ability of the virus can be measured using a variety of methods such as, but not limited to,
Western blot (proteins), Southern blot (DNA), Northern blot (RNA), plaque formation
assay, colorimetric, microscopic, and chemiluminescence techniques. The
assembling ability of the virus can be measured in an animal cell, preferably a primate
cell, more preferably a monkey cell, most preferably a human cell.

In another embodiment, the cytopathic effect of the virus is reduced by
10,000-fold, 9,000-fold, 8,000-fold, 7,000-fold, 6,000-fold, 5,000-fold, 4,000-fold, 3,000-fold,
2,500-fold, 2,000-fold, 1,500-fold, 1,250-fold, 1,000-fold, 900-fold, 800-fold, 700-fold,
600-fold, 500-fold, 400-fold, 300-fold, 200-fold, 100-fold, 50-fold, 25-fold, 10-fold,
5-fold, 1-fold, or 90%, 80%, 70%, 60%, 50%, 40%, 30%, 20%, or 10%. As used herein,
the term "cytopathic effect" refers to damage to infected host cells caused by the
infecting virus. Viral infection can lead to cell abnormalities (biochemical and
morphological) and/or cell death (e.g., lysis). In a specific embodiment, the cytopathic
effect of the hSARS virus is said to be attenuated or reduced when grown in a human
host if the growth of the hSARS virus or variant thereof in the human host is reduced
compared to the non-attenuated hSARS virus or variant thereof. The cytopathic effect of
the virus can be measured using a variety of methods such as, but not limited to, Western
blot (proteins), Southern blot (DNA), Northern blot (RNA), plaque formation assay,
colorimetric, microscopic, and chemiluminescence techniques. The cytopathic effect
of the virus can be measured in an animal cell, preferably a primate cell, more preferably
a monkey cell, most preferably a human cell.

The viruses of the invention can be attenuated such that one or more of the
functional characteristics of the virus are impaired. The attenuated phenotypes of hSARS
virus or variants thereof can be tested by any method known to the artisan. A candidate
virus can, for example, be tested for its ability to infect a host or for the rate of replication
in a cell culture system. In certain embodiments, growth curves at different temperatures
are used to test the attenuated phenotype of the virus. For example, an attenuated virus is
able to grow at 35°C, but not at 39°C or 40°C. In certain embodiments, different cell lines
can be used to evaluate the attenuated phenotype of the virus. For example, an attenuated
virus may only be able to grow in monkey cell lines but not the human cell lines, or the
achievable virus titers in different cell lines are different for the attenuated virus. In
certain embodiments, viral replication in the respiratory tract of a small animal model,
including but not limited to, hamsters, cotton rats, mice and guinea pigs, is used to
evaluate the attenuated phenotypes of the virus. In other embodiments, the immune
response induced by the virus, including but not limited to, the antibody titers (e.g.,
assayed by plaque reduction neutralization assay or ELISA) is used to evaluate the
attenuated phenotypes of the virus. In a specific embodiment, the plaque reduction
neutralization assay or ELISA is carried out at a low dose. In certain embodiments, the
ability of the hSARS virus to elicit pathological symptoms in an animal model can be
tested. A reduced ability of the virus to elicit pathological symptoms in an animal model
system is indicative of its attenuated phenotype. In a specific embodiment, the candidate
viruses are tested in a monkey model for nasal infection, indicated by mucus production.

In certain other embodiments, attenuation is measured in comparison to the
wild-type strain of the virus from which the attenuated virus is derived. In other embodiments,
attenuation is determined by comparing the growth of an attenuated virus in different host
systems. Thus, for a non-limiting example, the hSARS virus or a variant thereof is said
to be attenuated when grown in a human host if the growth of the hSARS or variant
thereof in the human host is reduced compared to the non-attenuated hSARS or variant
thereof.

In certain embodiments, the attenuated virus of the invention is capable of
infecting a host, or is capable of replicating in a host such that infectious viral particles
are produced. In comparison to the wild-type strain, however, the attenuated strain grows
to lower titers or grows more slowly. Any technique known to the skilled artisan can be
used to determine the growth curve of the attenuated virus and compare it to the growth
curve of the wild-type virus.
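
As a non-limiting illustration of such a comparison of already-measured growth curves, the following Python sketch reports the difference in peak titer ("grows to lower titers") and the time taken to reach a chosen titer ("grows more slowly"). The data layout (a list of time and log10 titer pairs), the function names, and the threshold parameter are assumptions made for the example and are not taken from the disclosure.

```python
from typing import List, Optional, Tuple

# Hypothetical layout: (hours post-infection, log10 titer) pairs.
Curve = List[Tuple[float, float]]


def peak_titer(curve: Curve) -> float:
    """Highest log10 titer observed on the curve."""
    return max(titer for _, titer in curve)


def time_to_threshold(curve: Curve, threshold: float) -> Optional[float]:
    """First sampled time at which the titer reaches the threshold, else None."""
    for hours, titer in sorted(curve):
        if titer >= threshold:
            return hours
    return None


def compare(wild_type: Curve, candidate: Curve, threshold: float) -> dict:
    """Summarize how the candidate curve differs from the wild-type curve."""
    return {
        "peak_log10_difference": peak_titer(wild_type) - peak_titer(candidate),
        "wild_type_time": time_to_threshold(wild_type, threshold),
        "candidate_time": time_to_threshold(candidate, threshold),
    }
```

A positive peak difference, or a later (or absent) time to threshold for the candidate, would be consistent with the reduced growth described above; the threshold itself is an arbitrary analysis parameter.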

In certain embodiments, the attenuated virus of the invention cannot replicate in
human cells as well as the wild-type virus does. However, the attenuated virus can
replicate well in a cell line that lacks interferon functions, such as Vero cells.

In other embodiments, the attenuated virus of the invention is capable of infecting
a host, of replicating in the host, and of causing proteins of the virus of the invention to
be inserted into the cytoplasmic membrane, but the attenuated virus does not cause the
host to produce new infectious viral particles. In certain embodiments, the attenuated
virus infects the host, replicates in the host, and causes viral proteins to be inserted in the
cytoplasmic membrane of the host with the same efficiency as the wild-type hSARS virus.
In other embodiments, the ability of the attenuated virus to cause viral proteins to be
inserted into the cytoplasmic membrane of the host cell is reduced compared to the
wild-type virus. In certain embodiments, the ability of the attenuated hSARS virus to
replicate in the host is reduced compared to the wild-type virus. Any technique known to
the skilled artisan can be used to determine whether a virus is capable of infecting a
mammalian cell, of replicating within the host, and of causing viral proteins to be inserted
into the cytoplasmic membrane of the host.

In certain embodiments, the attenuated virus of the invention is capable of
infecting a host. In contrast to the wild-type hSARS virus, however, the attenuated
hSARS virus cannot be replicated in the host. In a specific embodiment, the attenuated
hSARS virus can infect a host and can cause the host to insert viral proteins in its
cytoplasmic membranes, but the attenuated virus is incapable of being replicated in the
host. Any method known to the skilled artisan can be used to test whether the attenuated
hSARS virus has infected the host and has caused the host to insert viral proteins in its
cytoplasmic membranes.

In certain embodiments, the ability of the attenuated virus to infect a host is
reduced compared to the ability of the wild-type virus to infect the same host. Any
technique known to the skilled artisan can be used to determine whether a virus is
capable of infecting a host.

In certain embodiments, mutations (e.g., missense mutations) are introduced into
the genome of the virus, for example, into the nucleic acid sequence of SEQ ID NO: 1, 11,
13, 15, 16, 240, 737, 1108, 1590, 1965, 2471, 2472, 2473, 2474, 2475 or 2476, to
generate a virus with an attenuated phenotype. Mutations (e.g., missense mutations) can
be introduced into the structural genes and/or regulatory genes of the hSARS virus.
Mutations can be additions, substitutions, deletions, or combinations thereof. Such
variants of hSARS virus can be screened for a predicted functionality, such as infectivity,
replication ability, protein synthesis ability, assembling ability, as well as cytopathic
effect in cell cultures. In a specific embodiment, the missense mutation is a
cold-sensitive mutation. In another embodiment, the missense mutation is a heat-sensitive
mutation. In another embodiment, the missense mutation prevents a normal processing
or cleavage of the viral proteins.

In other embodiments, deletions are introduced into the genome of the hSARS
virus, which result in the attenuation of the virus.

In certain embodiments, attenuation of the virus is achieved by replacing a gene 
of the wild-type virus with a gene of a virus of a different species, of a different subgroup, 
or of a different variant. In another aspect, attenuation of the virus is achieved by 
replacing one or more specific domains of a protein of the wild-type virus with domains 
derived from the corresponding protein of a virus of a different species. In certain other 
embodiments, attenuation of the virus is achieved by deleting one or more specific 
domains of a protein of the wild-type virus. 

When a live attenuated vaccine is used, its safety must also be considered. The 
vaccine must not cause disease. Any techniques known in the art that can make a vaccine 
safe may be used in the present invention. In addition to attenuation techniques, other 
techniques may be used. One non-limiting example is to use a soluble heterologous gene 
that cannot be incorporated into the virion membrane. For example, a single copy of the 
soluble version of a viral transmembrane protein lacking the transmembrane and 
cytosolic domains thereof, can be used. 

Various assays can be used to test the safety of a vaccine. For example, sucrose 
gradients and neutralization assays can be used to test the safety. A sucrose gradient 
assay can be used to determine whether a heterologous protein is inserted in a virion. If 
the heterologous protein is inserted in the virion, the virion should be tested for its ability 
to cause symptoms in an appropriate animal model since the virus may have acquired 
new, possibly pathological, properties. 

5.3.2. Formulation of vaccines 

The invention provides vaccine formulations for the prevention and treatment of 
infections with hSARS virus. In certain embodiments, the vaccine of the invention 
comprises recombinant and chimeric viruses of the hSARS virus. In certain 
embodiments, the virus is attenuated, inactivated, or killed. 

In another embodiment of this aspect of the invention, inactivated vaccine 
formulations may be prepared using conventional techniques to "kill" the chimeric 
viruses. Inactivated vaccines are "dead" in the sense that their infectivity has been 
destroyed. Ideally, the infectivity of the virus is destroyed without affecting its 
immunogenicity. In order to prepare inactivated vaccines, the chimeric virus may be
grown in cell culture or in the allantois of the chick embryo, purified by zonal
ultracentrifugation, inactivated by formaldehyde or β-propiolactone, and pooled. The
resulting vaccine is usually inoculated intramuscularly.

Inactivated viruses may be formulated with a suitable adjuvant in order to
enhance the immunological response. Such adjuvants may include but are not limited to
mineral gels, e.g., aluminum hydroxide; surface active substances such as lysolecithin,
pluronic polyols, polyanions; peptides; oil emulsions; and potentially useful human
adjuvants such as BCG and Corynebacterium parvum.

The vaccines of the invention may be multivalent or univalent. Multivalent
vaccines are made from recombinant viruses that direct the expression of more than one
antigen.

In another aspect, the present invention also provides DNA vaccine formulations
comprising a nucleic acid or fragment of the hSARS virus, e.g., the virus having
accession no. CCTCC-V200303, or nucleic acid molecules having the sequence of SEQ
ID NO: 1, 11, 13, 15, 16, 240, 737, 1108, 1590, 1965, 2471, 2472, 2473, 2474, 2475 or
2476, or a complement, analog, derivative, or fragment thereof, or a portion thereof. In
another specific embodiment, the DNA vaccine formulations of the present invention
comprise a nucleic acid or fragment thereof encoding the antibodies which
immunospecifically bind hSARS viruses. In DNA vaccine formulations, a vaccine
DNA comprises a viral vector, such as that derived from the hSARS virus, bacterial
plasmid, or other expression vector, bearing an insert comprising a nucleic acid molecule
of the present invention operably linked to one or more control elements, thereby
allowing expression of the vaccinating proteins encoded by said nucleic acid molecule in
a vaccinated subject. Such vectors can be prepared by recombinant DNA technology as
recombinant or chimeric viral vectors carrying a nucleic acid molecule of the present
invention (see also Section 5.1, supra).

Various heterologous vectors are described for DNA vaccinations against viral
infections. For example, the vectors described in the following references may be used to
express hSARS sequences instead of the sequences of the viruses or other pathogens
described; in particular, vectors described for hepatitis B virus (Michel, M.L. et al., 1995,
DNA-mediated immunization to the hepatitis B surface antigen in mice: Aspects of the
humoral response mimic hepatitis B viral infection in humans, Proc. Natl. Acad. Sci. USA
92:5307-5311; Davis, H.L. et al., 1993, DNA-based immunization induces continuous
secretion of hepatitis B surface antigen and high levels of circulating antibody, Human
Molec. Genetics 2:1847-1851), HIV virus (Wang, B. et al., 1993, Gene inoculation
generates immune responses against human immunodeficiency virus type 1, Proc. Natl.
Acad. Sci. USA 90:4156-4160; Lu, S. et al., 1996, Simian immunodeficiency virus DNA
vaccine trial in macaques, J. Virol. 70:3978-3991; Letvin, N.L. et al., 1997, Potent,
protective anti-HIV immune responses generated by bimodal HIV envelope DNA plus
protein vaccination, Proc. Natl. Acad. Sci. USA 94(17):9378-83), and influenza viruses
(Robinson, H.L. et al., 1993, Protection against a lethal influenza virus challenge by
immunization with a haemagglutinin-expressing plasmid DNA, Vaccine 11:957-960;
Ulmer, J.B. et al., Heterologous protection against influenza by injection of DNA
encoding a viral protein, Science 259:1745-1749), as well as bacterial infections, such as
tuberculosis (Tascon, R.E. et al., 1996, Vaccination against tuberculosis by DNA
injection, Nature Med. 2:888-892; Huygen, K. et al., 1996, Immunogenicity and
protective efficacy of a tuberculosis DNA vaccine, Nature Med. 2:893-898), and
parasitic infections, such as malaria (Sedegah, M., 1994, Protection against malaria by
immunization with plasmid DNA encoding circumsporozoite protein, Proc. Natl. Acad.
Sci. USA 91:9866-9870; Doolan, D.L. et al., 1996, Circumventing genetic restriction of
protection against malaria with multigene DNA immunization: CD8+ T cell-, interferon γ-,
and nitric oxide-dependent immunity, J. Exper. Med. 183:1739-1746).

Many methods may be used to introduce the vaccine formulations described 
above. These include, but are not limited to, oral, intradermal, intramuscular, 
intraperitoneal, intravenous, subcutaneous, intranasal routes, and via scarification 
(scratching through the top layers of skin, e.g., using a bifurcated needle). 

Alternatively, it may be preferable to introduce the chimeric virus vaccine
formulation via the natural route of infection of the pathogen for which the vaccine is
designed. The DNA vaccines of the present invention may be administered in saline
solutions by injections into muscle or skin using a syringe and needle (Wolff, J.A. et al.,
1990, Direct gene transfer into mouse muscle in vivo, Science 247:1465-1468; Raz, E.,
1994, Intradermal gene immunization: The possible role of DNA uptake in the induction
of cellular immunity to viruses, Proc. Natl. Acad. Sci. USA 91:9519-9523). Another way
to administer DNA vaccines is the "gene gun" method, whereby microscopic gold
beads coated with the DNA molecules of interest are fired into the cells (Tang, D. et al.,
1992, Genetic immunization is a simple method for eliciting an immune response, Nature
356:152-154). For general reviews of the methods for DNA vaccines, see Robinson,
H.L., 1999, DNA vaccines: basic mechanism and immune responses (Review), Int. J.
Mol. Med. 4(5):549-555; Barber, B., 1997, Introduction: Emerging vaccine strategies,
Seminars in Immunology 9(5):269-270; and Robinson, H.L. et al., 1997, DNA vaccines,
Seminars in Immunology 9(5):271-283.

The patient to which the vaccine is administered is preferably a mammal, most
preferably a human, but can also be a non-human animal including but not limited to
cows, horses, sheep, pigs, fowl (e.g., chickens), goats, cats, dogs, hamsters, mice and rats.

5.3.3. Adjuvants and carrier molecules

In certain embodiments, hSARS-associated antigens are administered with one or
more adjuvants. In one embodiment, the hSARS-associated antigen is administered
together with a mineral salt adjuvant or mineral salt gel adjuvant. Such mineral salt and
mineral salt gel adjuvants include, but are not limited to, aluminum hydroxide
(ALHYDROGEL, REHYDRAGEL), aluminum phosphate gel, aluminum
hydroxyphosphate (ADJU-PHOS), and calcium phosphate.

In another embodiment, the hSARS-associated antigen is administered with an
immunostimulatory adjuvant. This class of adjuvants includes, but is not limited to,
cytokines (e.g., interleukin-2, interleukin-7, interleukin-12, granulocyte-macrophage
colony stimulating factor (GM-CSF), interferon-γ, interleukin-1β (IL-1β), and IL-1β
peptide or Sclavo Peptide), cytokine-containing liposomes, triterpenoid glycosides or
saponins (e.g., QuilA and QS-21, also sold under the trademark STIMULON,
ISCOPREP), Muramyl Dipeptide (MDP) derivatives, such as
N-acetyl-muramyl-L-threonyl-D-isoglutamine (Threonyl-MDP, sold under the trademark
TERMURTIDE), GMDP, N-acetyl-nor-muramyl-L-alanyl-D-isoglutamine,
N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1'-2'-dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)-ethylamine,
muramyl tripeptide phosphatidylethanolamine (MTP-PE), unmethylated CpG
dinucleotides and oligonucleotides, such as bacterial DNA and fragments thereof, LPS,
monophosphoryl Lipid A (3D-MLA sold under the trademark MPL), and
polyphosphazenes.


WO 2004/085455 



PCT/CN2004/000247 



In another embodiment, the adjuvant used is a particulate adjuvant, including, but
not limited to, emulsions, e.g., Freund's Complete Adjuvant, Freund's Incomplete
Adjuvant, squalene or squalane oil-in-water adjuvant formulations, such as SAF and
MF59, e.g., prepared with block-copolymers, such as L-121
(polyoxypropylene/polyoxyethylene) sold under the trademark PLURONIC L-121,
liposomes, virosomes, cochleates, and immune stimulating complex, which is sold
under the trademark ISCOM.

In another embodiment, a microparticulate adjuvant is used. Microparticulate
adjuvants include, but are not limited to, biodegradable and biocompatible polyesters,
homo- and copolymers of lactic acid (PLA) and glycolic acid (PGA),
poly(lactide-co-glycolides) (PLGA) microparticles, polymers that self-associate into
particulates (poloxamer particles), soluble polymers (polyphosphazenes), and virus-like
particles (VLPs) such as recombinant protein particulates, e.g., hepatitis B surface antigen
(HBsAg).

Yet another class of adjuvants that may be used includes mucosal adjuvants,
including but not limited to heat-labile enterotoxin from Escherichia coli (LT), cholera
holotoxin (CT) and cholera Toxin B Subunit (CTB) from Vibrio cholerae, mutant toxins
(e.g., LTK63 and LTR72), microparticles, and polymerized liposomes.

In other embodiments, any of the above classes of adjuvants may be used in
combination with each other or with other adjuvants. Non-limiting examples of
combination adjuvant preparations that can be used to administer the
hSARS-associated antigens of the invention include liposomes containing
immunostimulatory protein, cytokines, or T-cell and/or B-cell peptides, or microbes with
or without entrapped IL-2 or microparticles containing enterotoxin. Other adjuvants
known in the art are also included within the scope of the invention (see Vaccine Design:
The Subunit and Adjuvant Approach, Chap. 7, Michael F. Powell and Mark J. Newman
(eds.), Plenum Press, New York, 1995, which is incorporated herein by reference in its
entirety).

The effectiveness of an adjuvant may be determined by measuring the induction
of antibodies directed against an immunogenic polypeptide containing an hSARS
polypeptide epitope, the antibodies resulting from administration of this polypeptide in
vaccines which are also comprised of the various adjuvants.

The polypeptides may be formulated into the vaccine as neutral or salt forms.
Pharmaceutically acceptable salts include the acid addition salts (formed with free
amino groups of the peptide) which are formed with inorganic acids, such as, for
example, hydrochloric or phosphoric acids, or organic acids such as acetic, oxalic, tartaric,
maleic, and the like. Salts formed with free carboxyl groups may also be derived from
inorganic bases, such as, for example, sodium, potassium, ammonium, calcium, or ferric
hydroxides, and such organic bases as isopropylamine, trimethylamine, 2-ethylamino
ethanol, histidine, procaine and the like.

5.4. Preparation of Antibodies

Antibodies can be isolated from the serum of a subject infected with SARS. 
Antibodies which specifically recognize a polypeptide of the invention, such as, but not 
limited to, polypeptides comprising the sequence of SEQ ID NO: 2, 12 or 14, or 
polypeptides as shown in Figures 11 (SEQ ID NOS: 17-239, 241-736 and 738-1107) and 
12 (SEQ ID NOS: 1109-1589, 1591-1964 and 1966-2470), or hSARS epitope or antigen-binding
fragments thereof can be used for detecting, screening, and isolating the
polypeptide of the invention or fragments thereof, or similar sequences that might encode 
similar enzymes from the other organisms. For example, in one specific embodiment, an 
antibody which immunospecifically binds hSARS epitope, or a fragment thereof, can be 
used for various in vitro detection assays, including enzyme-linked immunosorbent 
assays (ELISA), radioimmunoassays, Western blot, etc., for the detection of a 
polypeptide of the invention or, preferably, polypeptides of the hSARS virus, in samples, 
for example, a biological material, including cells, cell culture media (e.g., bacterial cell 
culture media, mammalian cell culture media, insect cell culture media, yeast cell culture 
media, etc.), blood, serum, plasma, saliva, urine, stool, tissues, sputum, nasopharyngeal 
aspirates, etc. 

Antibodies specific for a polypeptide of the invention or any epitope of hSARS 
virus may be generated by any suitable method known in the art. Polyclonal antibodies 
to an antigen-of-interest, for example, the hSARS virus from deposit no. CCTCC-V200303,
or which comprises a nucleic acid sequence of SEQ ID NO: 15, can be
produced by various procedures well known in the art. For example, an antigen can be
administered to various host animals including, but not limited to, rabbits, mice, rats, etc.,
to induce the production of antisera containing polyclonal antibodies specific for the
antigen. Various adjuvants may be used to increase the immunological response,
depending on the host species, and include but are not limited to, Freund's (complete and
incomplete) adjuvant, mineral gels such as aluminum hydroxide, surface active
substances such as lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions,
keyhole limpet hemocyanins, dinitrophenol, and potentially useful adjuvants for humans 
such as BCG (Bacille Calmette-Guerin) and Corynebacterium parvum. Such adjuvants 
are also well known in the art. 

Monoclonal antibodies can be prepared using a wide variety of techniques known 
in the art including the use of hybridoma, recombinant, and phage display technologies, 
or a combination thereof. For example, monoclonal antibodies can be produced using 
hybridoma techniques including those known in the art and taught, for example, in 
Harlow et al., Antibodies: A Laboratory Manual, (Cold Spring Harbor Laboratory Press,
2nd ed. 1988); Hammerling et al., in: Monoclonal Antibodies and T-Cell Hybridomas,
pp. 563-681 (Elsevier, N.Y., 1981) (both of which are incorporated herein by reference in 
their entireties). The term "monoclonal antibody" as used herein is not limited to 
antibodies produced through hybridoma technology. The term "monoclonal antibody" 
refers to an antibody that is derived from a single clone, including any eukaryotic, 
prokaryotic, or phage clone, and not the method by which it is produced. 

Methods for producing and screening for specific antibodies using hybridoma 
technology are routine and well known in the art. In a non-limiting example, mice can be 
immunized with an antigen of interest or a cell expressing such an antigen. Once an 
immune response is detected, e.g., antibodies specific for the antigen are detected in the 
mouse serum, the mouse spleen is harvested and splenocytes isolated. The splenocytes 
are then fused by well known techniques to any suitable myeloma cells. Hybridomas are 
selected and cloned by limiting dilution. The hybridoma clones are then assayed by 
methods known in the art for cells that secrete antibodies capable of binding the antigen. 
Ascites fluid, which generally contains high levels of antibodies, can be generated by 
inoculating mice intraperitoneally with positive hybridoma clones. 

Antibody fragments which recognize specific epitopes may be generated by 
known techniques. For example, Fab and F(ab')2 fragments may be produced by
proteolytic cleavage of immunoglobulin molecules, using enzymes such as papain (to
produce Fab fragments) or pepsin (to produce F(ab')2 fragments). F(ab')2 fragments
contain the complete light chain, and the variable region, the CH1 region and the hinge
region of the heavy chain.

The antibodies of the invention or fragments thereof can also be produced by any
method known in the art for the synthesis of antibodies, in particular, by chemical
synthesis or, preferably, by recombinant expression techniques.

The nucleotide sequence encoding an antibody may be obtained from any
information available to those skilled in the art (e.g., from GenBank, the literature, or by
routine cloning and sequence analysis). If a clone containing a nucleic acid encoding a
particular antibody or an epitope-binding fragment thereof is not available, but the
sequence of the antibody molecule or epitope-binding fragment thereof is known, a
nucleic acid encoding the immunoglobulin may be chemically synthesized or obtained
from a suitable source (e.g., an antibody cDNA library, or a cDNA library generated from,
or nucleic acid, preferably poly A+ RNA, isolated from any tissue or cells expressing the
antibody, such as hybridoma cells selected to express an antibody) by PCR amplification
using synthetic primers hybridizable to the 3' and 5' ends of the sequence or by cloning
using an oligonucleotide probe specific for the particular gene sequence to identify, e.g.,
a cDNA clone from a cDNA library that encodes the antibody. Amplified nucleic acids
generated by PCR may then be cloned into replicable cloning vectors using any method
well known in the art.

Once the nucleotide sequence of the antibody is determined, the nucleotide
sequence of the antibody may be manipulated using methods well known in the art for
the manipulation of nucleotide sequences, e.g., recombinant DNA techniques,
site-directed mutagenesis, PCR, etc. (see, for example, the techniques described in Sambrook
et al., supra; and Ausubel et al., eds., 1998, Current Protocols in Molecular Biology, John
Wiley & Sons, NY, which are both incorporated by reference herein in their entireties), to
generate antibodies having a different amino acid sequence by, for example, introducing
amino acid substitutions, deletions, and/or insertions into the epitope-binding domain
regions of the antibodies or any portion of antibodies which may enhance or reduce
biological activities of the antibodies.

Recombinant expression of an antibody requires construction of an expression 
vector containing a nucleotide sequence that encodes the antibody. Once a nucleotide
sequence encoding an antibody molecule or a heavy or light chain of an antibody, or
portion thereof, has been obtained, the vector for the production of the antibody molecule
may be produced by recombinant DNA technology using techniques well known in the
art as discussed in the previous sections. Methods which are well known to those skilled
in the art can be used to construct expression vectors containing antibody coding
sequences and appropriate transcriptional and translational control signals. These
methods include, for example, in vitro recombinant DNA techniques, synthetic
techniques, and in vivo genetic recombination. The nucleotide sequence encoding the
heavy-chain variable region, light-chain variable region, both the heavy-chain and
light-chain variable regions, an epitope-binding fragment of the heavy- and/or light-chain
variable region, or one or more complementarity determining regions (CDRs) of an
antibody may be cloned into such a vector for expression. The expression vector thus
prepared can then be introduced into appropriate host cells for the expression of the
antibody. Accordingly, the invention includes host cells containing a polynucleotide
encoding an antibody specific for the polypeptides of the invention or fragments thereof.

The host cell may be co-transfected with two expression vectors of the invention, 
the first vector encoding a heavy chain derived polypeptide and the second vector 
encoding a light chain derived polypeptide. The two vectors may contain identical 
selectable markers which enable equal expression of heavy and light chain polypeptides 
or different selectable markers to ensure maintenance of both plasmids. Alternatively, a 
single vector may be used which encodes, and is capable of expressing, both heavy and 
light chain polypeptides. In such situations, the light chain should be placed before the 
heavy chain to avoid an excess of toxic free heavy chain (Proudfoot, 1986, Nature 322:52; 
and Kohler, 1980, Proc. Natl. Acad. Sci. U.S.A. 77:2197). The coding sequences for the
heavy and light chains may comprise cDNA or genomic DNA. 

In another embodiment, antibodies can also be generated using various phage 
display methods known in the art. In phage display methods, functional antibody 
domains are displayed on the surface of phage particles which carry the polynucleotide 
sequences encoding them. In a particular embodiment, such phage can be utilized to 
display antigen binding domains, such as Fab and Fv or disulfide-bond stabilized Fv, 
expressed from a repertoire or combinatorial antibody library (e.g., human or murine). 
Phage expressing an antigen binding domain that binds the antigen of interest can be
selected or identified with antigen, e.g., using labeled antigen or antigen bound or
captured to a solid surface or bead. Phage used in these methods are typically
filamentous phage, including fd and M13. The antigen binding domains are expressed as
a recombinantly fused protein to either the phage gene III or gene VIII protein. Examples
of phage display methods that can be used to make the immunoglobulins, or fragments
thereof, of the present invention include those disclosed in Brinkman et al., 1995, J.
Immunol. Methods 182:41-50; Ames et al., 1995, J. Immunol. Methods 184:177-186;
Kettleborough et al., 1994, Eur. J. Immunol. 24:952-958; Persic et al., 1997, Gene
187:9-18; Burton et al., 1994, Advances in Immunology 57:191-280; PCT application No.
PCT/GB91/01134; PCT publications WO 90/02809; WO 91/10737; WO 92/01047; WO
92/18619; WO 93/11236; WO 95/15982; WO 95/20401; and U.S. Patent Nos. 5,698,426;
5,223,409; 5,403,484; 5,580,717; 5,427,908; 5,750,753; 5,821,047; 5,571,698; 5,427,908;
5,516,637; 5,780,225; 5,658,727; 5,733,743 and 5,969,108; each of which is incorporated
herein by reference in its entirety.

As described in the above references, after phage selection, the antibody coding
regions from the phage can be isolated and used to generate whole antibodies, including
human antibodies, or any other desired fragments, and expressed in any desired host,
including mammalian cells, insect cells, plant cells, yeast, and bacteria, e.g., as described
in detail below. For example, techniques to recombinantly produce Fab, Fab' and
F(ab')2 fragments can also be employed using methods known in the art such as those
disclosed in PCT publication WO 92/22324; Mullinax et al., 1992, BioTechniques
12(6):864-869; Sawai et al., 1995, AJRI 34:26-34; and Better et al., 1988, Science
240:1041-1043 (each of which is incorporated herein by reference in its entirety).
Examples of techniques which can be used to produce single-chain Fvs and antibodies
include those described in U.S. Patent Nos. 4,946,778 and 5,258,498; Huston et al., 1991,
Methods in Enzymology 203:46-88; Shu et al., 1993, PNAS 90:7995-7999; and Skerra et
al., 1988, Science 240:1038-1040.

Once an antibody molecule of the invention has been produced by any methods
described above, it may then be purified by any method known in the art for purification
of an immunoglobulin molecule, for example, by chromatography (e.g., ion exchange,
affinity, particularly by affinity for the specific antigen after Protein A or Protein G
purification, and sizing column chromatography), centrifugation, differential solubility,
or by any other standard techniques for the purification of proteins. Further, the
antibodies of the present invention or fragments thereof may be fused to heterologous
polypeptide sequences described herein or otherwise known in the art to facilitate
purification.

For some uses, including in vivo use of antibodies in humans and in vitro 
detection assays, it may be preferable to use chimeric, humanized, or human antibodies. 
A chimeric antibody is a molecule in which different portions of the antibody are derived 
from different animal species, such as antibodies having a variable region derived from a 
murine monoclonal antibody and a constant region derived from a human 
immunoglobulin. Methods for producing chimeric antibodies are known in the art. See 
e.g., Morrison, 1985, Science 229:1202; Oi et al., 1986, BioTechniques 4:214; Gillies et
al., 1989, J. Immunol. Methods 125:191-202; U.S. Patent Nos. 5,807,715; 4,816,567; and
4,816,397, which are incorporated herein by reference in their entireties. Humanized
antibodies are antibody molecules from non-human species that bind the desired antigen 
having one or more complementarity determining regions (CDRs) from the non-human 
species and framework regions from a human immunoglobulin molecule. Often, 
framework residues in the human framework regions will be substituted with the 
corresponding residue from the CDR donor antibody to alter, preferably improve, antigen 
binding. These framework substitutions are identified by methods well known in the art, 
e.g., by modeling of the interactions of the CDR and framework residues to identify 
framework residues important for antigen binding and sequence comparison to identify 
unusual framework residues at particular positions. See, e.g., Queen et al, U.S. Patent 
No. 5,585,089; Riechmann et al, 1988, Nature 332:323, which are incorporated herein 
by reference in their entireties. Antibodies can be humanized using a variety of 
techniques known in the art including, for example, CDR-grafting (EP 239,400; PCT 
publication WO 91/09967; U.S. Patent Nos. 5,225,539; 5,530,101 and 5,585,089), 
veneering or resurfacing (EP 592,106; EP 519,596; Padlan, 1991, Molecular Immunology 
28(4/5):489-498; Studnicka et al., 1994, Protein Engineering 7(6):805-814; Roguska et
al., 1994, Proc. Natl. Acad. Sci. U.S.A. 91:969-973), and chain shuffling (U.S. Patent No.
5,565,332), all of which are hereby incorporated by reference in their entireties. 

Completely human antibodies are particularly desirable for therapeutic treatment 
of human patients. Human antibodies can be made by a variety of methods known in the
art including phage display methods described above using antibody libraries derived
from human immunoglobulin sequences. See U.S. Patent Nos. 4,444,887 and 4,716,111;
and PCT publications WO 98/46645; WO 98/50433; WO 98/24893; WO 98/16654; WO
96/34096; WO 96/33735; and WO 91/10741, each of which is incorporated herein by
reference in its entirety.

Human antibodies can also be produced using transgenic mice which are 
incapable of expressing functional endogenous immunoglobulins, but which can express 
human immunoglobulin genes. For an overview of this technology for producing human 
antibodies, see Lonberg and Huszar, 1995, Int. Rev. Immunol. 13:65-93. For a detailed
discussion of this technology for producing human antibodies and human monoclonal
antibodies and protocols for producing such antibodies, see, e.g., PCT publications WO
98/24893; WO 92/01047; WO 96/34096; WO 96/33735; European Patent No. 0 598 877;
U.S. Patent Nos. 5,413,923; 5,625,126; 5,633,425; 5,569,825; 5,661,016; 5,545,806;
5,814,318; 5,885,793; 5,916,771; and 5,939,598, which are incorporated by reference
herein in their entireties. In addition, companies such as Abgenix, Inc. (Fremont, CA),
Medarex (NJ) and Genpharm (San Jose, CA) can be engaged to provide human 
antibodies directed against a selected antigen using technology similar to that described 
above. 

Completely human antibodies which recognize a selected epitope can be
generated using a technique referred to as "guided selection." In this approach a selected
non-human monoclonal antibody, e.g., a mouse antibody, is used to guide the selection of
a completely human antibody recognizing the same epitope (Jespers et al., 1988,
Bio/technology 12:899-903).

Antibodies fused or conjugated to heterologous polypeptides may be used in in
vitro immunoassays and in purification methods (e.g., affinity chromatography) well
known in the art. See, e.g., PCT publication Number WO 93/21232; EP 439,095;
Naramura et al., 1994, Immunol. Lett. 39:91-99; U.S. Patent 5,474,981; Gillies et al.,
1992, PNAS 89:1428-1432; and Fell et al., 1991, J. Immunol. 146:2446-2452, which are
incorporated herein by reference in their entireties.

Antibodies may also be attached to solid supports, which are particularly useful
for immunoassays or purification of the polypeptides of the invention or fragments,
derivatives, analogs, or variants thereof, or similar molecules having similar
enzymatic activities as the polypeptide of the invention. Such solid supports include, but
are not limited to, glass, cellulose, polyacrylamide, nylon, polystyrene, polyvinyl chloride
or polypropylene.

5.5. Pharmaceutical Compositions and Kits 

The present invention encompasses pharmaceutical compositions comprising
anti-viral agents of the present invention. In a specific embodiment, the anti-viral agent is an
antibody which immunospecifically binds and neutralizes the hSARS virus, natural or
artificial variants, analogs, or derivatives thereof, or any proteins derived therefrom. The
virus neutralizing antibody neutralizes the infectivity of the virus and protects an animal
against disease when wild-type virus is subsequently administered to the animal.

In another specific embodiment, the anti-viral agent is a polypeptide or nucleic
acid molecule of the invention. The pharmaceutical compositions have utility as an
anti-viral prophylactic agent and may be administered to a subject where the subject has been
exposed or is expected to be exposed to a virus.

Various delivery systems are known and can be used to administer the
pharmaceutical composition of the invention, e.g., encapsulation in liposomes,
microparticles, microcapsules, recombinant cells capable of expressing the mutant
viruses, and receptor mediated endocytosis (see, e.g., Wu and Wu, 1987, J. Biol. Chem.
262:4429-4432). Methods of introduction include but are not limited to intradermal,
intramuscular, intraperitoneal, intravenous, subcutaneous, intranasal, epidural,
scarification, and oral routes. The compounds may be administered by any convenient
route, for example by infusion or bolus injection, by absorption through epithelial or
mucocutaneous linings (e.g., oral mucosa, rectal and intestinal mucosa, etc.) and may be
administered together with other biologically active agents. Administration can be
systemic or local. In a preferred embodiment, it may be desirable to introduce the
pharmaceutical compositions of the invention into the lungs by any suitable route.
Pulmonary administration can also be employed, e.g., by use of an inhaler or nebulizer,
and formulation with an aerosolizing agent.

In a specific embodiment, it may be desirable to administer the pharmaceutical
compositions of the invention locally to the area in need of treatment; this may be
achieved by, for example, and not by way of limitation, local infusion during surgery,
topical application, e.g., in conjunction with a wound dressing after surgery, by injection,
by means of a catheter, by means of a suppository, or by means of an implant, said
implant being of a porous, non-porous, or gelatinous material, including membranes, such
as sialastic membranes, or fibers. In one embodiment, administration can be by direct
injection at the site (or former site) of infected tissues.

In another embodiment, the pharmaceutical composition can be delivered in a
vesicle, in particular a liposome (see Langer, 1990, Science 249:1527-1533; Treat et al.,
in Liposomes in the Therapy of Infectious Disease and Cancer, Lopez-Berestein and
Fidler (eds.), Liss, New York, pp. 353-365 (1989); Lopez-Berestein, ibid., pp. 317-327;
see generally ibid.).

In yet another embodiment, the pharmaceutical composition can be delivered in a
controlled release system. In one embodiment, a pump may be used (see Langer, supra;
Sefton, 1987, CRC Crit. Ref. Biomed. Eng. 14:201; Buchwald et al., 1980, Surgery 88:507;
and Saudek et al., 1989, N. Engl. J. Med. 321:574). In another embodiment, polymeric
materials can be used (see Medical Applications of Controlled Release, Langer and Wise
(eds.), CRC Press, Boca Raton, Florida (1974); Controlled Drug Bioavailability, Drug
Product Design and Performance, Smolen and Ball (eds.), Wiley, New York (1984);
Ranger and Peppas, 1983, J. Macromol. Sci. Rev. Macromol. Chem. 23:61; see also Levy
et al., 1985, Science 228:190; During et al., 1989, Ann. Neurol. 25:351; Howard et al.,
1989, J. Neurosurg. 71:105). In yet another embodiment, a controlled release system can
be placed in proximity of the composition's target, i.e., the lung, thus requiring only a
fraction of the systemic dose (see, e.g., Goodson, in Medical Applications of Controlled
Release, supra, vol. 2, pp. 115-138 (1984)).

Other controlled release systems are discussed in the review by Langer (1990,
Science 249:1527-1533).

The pharmaceutical compositions of the present invention comprise a
therapeutically effective amount of a live attenuated, inactivated or killed hSARS virus,
or recombinant or chimeric hSARS virus, and a pharmaceutically acceptable carrier. In a
specific embodiment, the term "pharmaceutically acceptable" means approved by a
regulatory agency of the Federal or a state government or listed in the U.S. Pharmacopeia
or other generally recognized pharmacopeia for use in animals, and more particularly in
humans. The term "carrier" refers to a diluent, adjuvant, excipient, or vehicle with which
the pharmaceutical composition is administered. Such pharmaceutical carriers can be
sterile liquids, such as water and oils, including those of petroleum, animal, vegetable or
synthetic origin, such as peanut oil, soybean oil, mineral oil, sesame oil and the like.
Water is a preferred carrier when the pharmaceutical composition is administered
intravenously. Saline solutions and aqueous dextrose and glycerol solutions can also be
employed as liquid carriers, particularly for injectable solutions. Suitable pharmaceutical
excipients include starch, glucose, lactose, sucrose, gelatin, malt, rice, flour, chalk, silica
gel, sodium stearate, glycerol monostearate, talc, sodium chloride, dried skim milk,
glycerol, propylene glycol, water, ethanol and the like. The composition, if desired, can
also contain minor amounts of wetting or emulsifying agents, or pH buffering agents.
These compositions can take the form of solutions, suspensions, emulsions, tablets, pills,
capsules, powders, sustained release formulations and the like. The composition can be
formulated as a suppository, with traditional binders and carriers such as triglycerides.
Oral formulation can include standard carriers such as pharmaceutical grades of mannitol,
lactose, starch, magnesium stearate, sodium saccharine, cellulose, magnesium carbonate,
etc. Examples of suitable pharmaceutical carriers are described in "Remington's
Pharmaceutical Sciences" by E.W. Martin. The formulation should suit the mode of
administration.

In a preferred embodiment, the composition is formulated in accordance with
routine procedures as a pharmaceutical composition adapted for intravenous
administration to human beings. Typically, compositions for intravenous administration
are solutions in sterile isotonic aqueous buffer. Where necessary, the composition may
also include a solubilizing agent and a local anesthetic such as lignocaine to ease pain at
the site of the injection. Generally, the ingredients are supplied either separately or
mixed together in unit dosage form, for example, as a dry lyophilized powder or
water-free concentrate in a hermetically sealed container such as an ampoule or sachette
indicating the quantity of active agent. Where the composition is to be administered by
infusion, it can be dispensed with an infusion bottle containing sterile pharmaceutical
grade water or saline. Where the composition is administered by injection, an ampoule of
sterile water for injection or saline can be provided so that the ingredients may be mixed
prior to administration.

The pharmaceutical compositions of the invention can be formulated as neutral or
salt forms. Pharmaceutically acceptable salts include those formed with free amino
groups such as those derived from hydrochloric, phosphoric, acetic, oxalic, tartaric acids,
etc., and those formed with free carboxyl groups such as those derived from sodium,
potassium, ammonium, calcium, ferric hydroxides, isopropylamine, triethylamine,
2-ethylamino ethanol, histidine, procaine, etc.

The amount of the pharmaceutical composition of the invention which will be
effective in the treatment of a particular disorder or condition will depend on the nature
of the disorder or condition, and can be determined by standard clinical techniques. In
addition, in vitro assays may optionally be employed to help identify optimal dosage
ranges. The precise dose to be employed in the formulation will also depend on the route
of administration, and the seriousness of the disease or disorder, and should be decided
according to the judgment of the practitioner and each patient's circumstances. However,
suitable dosage ranges for intravenous administration are generally about 20 to 500
micrograms of active compound per kilogram body weight. Suitable dosage ranges for
intranasal administration are generally about 0.01 pg/kg body weight to 1 mg/kg body
weight. Effective doses may be extrapolated from dose response curves derived from in
vitro or animal model test systems.
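
The per-kilogram range quoted above for intravenous administration converts to an absolute amount by simple multiplication. The following is a minimal sketch of that arithmetic only; the function name and the 70 kg example weight are illustrative assumptions, and the output is not a dosing recommendation.

```python
def iv_dose_range_micrograms(body_weight_kg, low_ug_per_kg=20.0, high_ug_per_kg=500.0):
    """Return (low, high) total micrograms for a per-kilogram dosage range.

    The defaults of 20 and 500 micrograms/kg are the intravenous range quoted
    in the text. body_weight_kg is a hypothetical input chosen by the caller.
    """
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    return (low_ug_per_kg * body_weight_kg, high_ug_per_kg * body_weight_kg)


if __name__ == "__main__":
    # 70 kg is an assumed example weight, not a value from the text.
    low, high = iv_dose_range_micrograms(70.0)
    print(f"{low:.0f} to {high:.0f} micrograms")
```

The practitioner's judgment and the dose response data mentioned above still govern the choice of a dose within, or outside, this range.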

Suppositories generally contain active ingredient in the range of 0.5% to 10% by
weight; oral formulations preferably contain 10% to 95% active ingredient.

The invention also provides a pharmaceutical pack or kit comprising one or more
containers filled with one or more of the ingredients of the pharmaceutical compositions
of the invention. Optionally associated with such container(s) can be a notice in the form
prescribed by a governmental agency regulating the manufacture, use or sale of
pharmaceuticals or biological products, which notice reflects approval by the agency of
manufacture, use or sale for human administration. In a preferred embodiment, the kit
contains an anti-viral agent of the invention, e.g., an antibody specific for the
polypeptides encoded by a nucleic acid sequence of SEQ ID NO:1, 11, 13, 15, 2471,
2472, 2473, 2474, 2475 or 2476, or as shown in Figures 11 (SEQ ID NO: 17-239,
241-736 or 738-1107) and 12 (SEQ ID NO: 1109-1589, 1591-1964 or 1966-2470), or any
hSARS epitope, or a polypeptide or protein of the present invention, or a nucleic acid
molecule of the invention, alone or in combination with adjuvants, antivirals, antibiotics,
analgesics, bronchodilators, or other pharmaceutically acceptable excipients.

The present invention further encompasses kits comprising a container containing 
a pharmaceutical composition of the present invention and instructions for use. 

5.6. Detection Assays

The present invention provides a method for detecting an antibody, which
immunospecifically binds to the hSARS virus, in a biological sample, for example, cells,
blood, serum, plasma, saliva, urine, stool, sputum, nasopharyngeal aspirates, and so forth,
from a patient suffering from SARS. In a specific embodiment, the method comprises
contacting the sample with the hSARS virus, for example, of deposit no. CCTCC-V200303,
or having a genomic nucleic acid sequence of SEQ ID NO: 15, directly
immobilized on a substrate and detecting the virus-bound antibody directly or indirectly
by a labeled heterologous anti-isotype antibody. In another specific embodiment, the
sample is contacted with a host cell which is infected by the hSARS virus, for example,
of deposit no. CCTCC-V200303, or having a genomic nucleic acid sequence of SEQ ID
NO: 15, and the bound antibody can be detected by immunofluorescent assay as described
in Section 6.5, infra.

An exemplary method for detecting the presence or absence of a polypeptide or
nucleic acid of the invention in a biological sample involves obtaining a biological
sample from various sources and contacting the sample with a compound or an agent
capable of detecting an epitope or nucleic acid (e.g., mRNA, genomic RNA) of the
hSARS virus such that the presence of the hSARS virus is detected in the sample. A
preferred agent for detecting hSARS mRNA or genomic RNA of the invention is a
labeled nucleic acid probe capable of hybridizing to mRNA or genomic RNA encoding a
polypeptide of the invention. The nucleic acid probe can be, for example, a nucleic acid
molecule comprising or consisting of the nucleic acid sequence of SEQ ID NO: 1, 11, 13,
15, 16, 240, 737, 1108, 1590, 1965, 2471, 2472, 2473, 2474, 2475 or 2476, or a
complement, analog, derivative, or fragment thereof, or a portion thereof, such as an
oligonucleotide of at least 15, 20, 25, 30, 50, 100, 250, 500, 750, 1,000 or more
contiguous nucleotides in length and sufficient to specifically hybridize under stringent
conditions to an hSARS mRNA or genomic RNA.

In another preferred specific embodiment, the presence of hSARS virus is
detected in the sample by a reverse transcription polymerase chain reaction (RT-PCR)
using primers that are constructed based on a partial nucleotide sequence of the
genome of hSARS virus, for example, that of deposit accession no. CCTCC-V200303, or
based on a nucleic acid sequence of SEQ ID NO: 1, 11, 13, 15, 16, 240, 737, 1108, 1590
or 1965. In a non-limiting specific embodiment, preferred primers to be used in an RT-PCR
method are 5'-TACACACCTCAGCGTTG-3' (SEQ ID NO:3) and
5'-CACGAACGTGACGAAT-3' (SEQ ID NO:4), in the presence of 2.5 mM MgCl2, and
the thermal cycles are, for example, but not limited to, 94°C for 8 min followed by 40
cycles of 94°C for 1 min, 50°C for 1 min, 72°C for 1 min (also see Section 6.7, infra).
In a more preferred specific embodiment, the present invention provides a real-time
quantitative PCR assay to detect the presence of hSARS virus in a biological sample by
subjecting the cDNA obtained by reverse transcription of the extracted total RNA from
the sample to PCR reactions using the specific primers, such as those having nucleic acid
sequences of SEQ ID NOS:3 and 4, and a fluorescence dye, such as SYBR® Green I,
which fluoresces when bound non-specifically to double-stranded DNA. In yet another
preferred specific embodiment, the real-time quantitative PCR used in the present
invention is a TaqMan® assay (see Section 5, supra). Specifically, the preferred primers
to be used in a real-time quantitative PCR assay to detect the presence of hSARS virus in
a biological sample are those having nucleic acid sequences of SEQ ID NOS:2471 and
2472. In this case, the amplified product is detected by a TaqMan® probe, preferably
having a nucleotide sequence of SEQ ID NO:2473. Other preferred primers to be used
in a TaqMan® assay are those having nucleic acid sequences of SEQ ID NOS:2474 and
2475, and a preferred TaqMan® probe has a nucleotide sequence of SEQ ID NO:2476.
The fluorescence signals from these reactions are captured at the end of extension steps
as PCR product is generated over a range of the thermal cycles, thereby allowing the
quantitative determination of the viral load in the sample based on an amplification plot
(see Sections 6.7 and 6.8, infra).
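
The paragraph above states that viral load is determined from an amplification plot but does not spell out the calculation. One common approach, assumed here rather than taken from this text, is to run a dilution series of known copy number, fit the threshold cycle (Ct) against log10 copies, and read unknown samples off the fitted line. The sketch below uses invented standards purely for illustration; every function name and number in it is hypothetical.

```python
import math

# Hypothetical dilution series: (input copies, observed Ct). Invented values.
EXAMPLE_STANDARDS = [(1e1, 36.68), (1e3, 30.04), (1e5, 23.40), (1e7, 16.76)]


def fit_standard_curve(standards):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(copies) for copies, _ in standards]
    ys = [ct for _, ct in standards]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct standard concentrations")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the fitted line to estimate starting copies for a sample Ct."""
    return 10 ** ((ct - intercept) / slope)


def amplification_efficiency(slope):
    """Per-cycle efficiency implied by the slope (1.0 means perfect doubling)."""
    return 10 ** (-1.0 / slope) - 1.0


if __name__ == "__main__":
    slope, intercept = fit_standard_curve(EXAMPLE_STANDARDS)
    print(f"slope={slope:.2f} intercept={intercept:.2f}")
    print(f"estimated copies at Ct 26.72: {copies_from_ct(26.72, slope, intercept):.0f}")
```

A slope near -3.32 corresponds to roughly one doubling per cycle, which is why the efficiency is derived from the slope rather than measured separately.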

In another preferred specific embodiment, the presence of hSARS virus is
detected in the sample using fluorescent cDNA microarray technology. An inventory of
cDNA probes derived from the hSARS virus, for example, of deposit no. CCTCC-V200303,
or having a genomic nucleic acid sequence of SEQ ID NO: 15, is prepared by
reverse transcription and amplification using appropriate primers that are constructed
based on, for example, a partial nucleotide sequence of the genome of said hSARS virus,
or based on a nucleic acid sequence of SEQ ID NOS: 1, 11, 13, 15, 16, 240, 737, 1108,
1590 or 1965. The amplified products are then purified and immobilized onto a chip,
for example, a poly-L-lysine coated glass plate as a cDNA microarray. Total RNA is
extracted from a biological sample and subjected to reverse transcription in the presence
of fluorescence-labeled nucleotides. The labeled cDNA representing the mRNA in the
sample is then contacted with the immobilized cDNA probes on the microarray and the
fluorescence signals of the bound cDNA are detected and quantified. A variety of DNA
microarray methods have been described, for example, in Nucleic Acids Res. 28(22):4552-7
(by Kane, M.D. et al., 2000); Science 289(5485):1757-60 (by Taton, T.A. et
al., 2000); and Nature 405(6788):827-836 (by Lockhart, D.J. et al., 2000).

Another preferred agent for detecting hSARS virus is an antibody that specifically
binds a polypeptide of the invention or any hSARS epitope, preferably an antibody with a
detectable label. Antibodies can be polyclonal, or more preferably, monoclonal. An
intact antibody, or a fragment thereof (e.g., Fab or F(ab')2) can be used.

The term "labeled", with regard to the probe or antibody, is intended to
encompass direct labeling of the probe or antibody by coupling (i.e., physically linking) a
detectable substance to the probe or antibody, as well as indirect labeling of the probe or
antibody by reactivity with another reagent that is directly labeled. Examples of indirect
labeling include detection of a primary antibody using a fluorescently labeled secondary
antibody and end-labeling of a DNA probe with biotin such that it can be detected with
fluorescently labeled streptavidin. The detection method of the invention can be used to
detect mRNA, protein (or any epitope), or genomic RNA in a sample in vitro as well as
in vivo. For example, in vitro techniques for detection of mRNA include northern
hybridizations, in situ hybridizations, RT-PCR, and RNase protection. In vitro
techniques for detection of an epitope of hSARS virus include enzyme linked
immunosorbent assays (ELISAs), Western blots, immunoprecipitations and
immunofluorescence. In vitro techniques for detection of genomic RNA include northern
hybridizations, RT-PCR, and RNase protection. Furthermore, in vivo techniques for
detection of hSARS virus include introducing into a subject organism a labeled antibody
directed against the polypeptide. For example, the antibody can be labeled with a
radioactive marker whose presence and location in the subject organism can be detected
by standard imaging techniques, including autoradiography.

In a specific embodiment, the methods further involve obtaining a control sample 
from a control subject, contacting the control sample with a compound or agent capable 
of detecting hSARS virus, e.g., a polypeptide of the invention or mRNA or genomic 
RNA encoding a polypeptide of the invention, such that the presence of hSARS virus or 
the polypeptide or mRNA or genomic RNA encoding the polypeptide is detected in the 
sample, and comparing the presence of hSARS virus or the polypeptide or mRNA or 
genomic RNA encoding the polypeptide in the control sample with the presence of 
hSARS virus, or the polypeptide or mRNA or genomic RNA encoding the polypeptide in
the test sample. 

In a specific embodiment, the invention provides a diagnostic kit comprising 
nucleic acid molecules which are suitable for use to detect the hSARS virus, natural or
artificial variants, analogs, or derivatives thereof. In a specific embodiment, the nucleic
acid molecules have the nucleic acid sequence of SEQ ID NOS:2471 and 2472. In 
specific embodiments, the nucleic acid molecule has the nucleic acid sequence of SEQ ID 
NO: 2473. In another specific embodiment, the nucleic acid molecules have the nucleic 
acid sequence of SEQ ID NOS:2474 and 2475. In specific embodiments, the nucleic acid 
molecule has the nucleic acid sequence of SEQ ID NO:2476. 

The invention also encompasses kits for detecting the presence of hSARS virus or 
a polypeptide or nucleic acid of the invention in a test sample. The kit, for example, can 
comprise a labeled compound or agent capable of detecting hSARS virus or the 
polypeptide or a nucleic acid molecule encoding the polypeptide in a test sample and, in 
certain embodiments, a means for determining the amount of the polypeptide or mRNA 
in the sample (e.g., an antibody which binds the polypeptide or an oligonucleotide probe 
which binds to DNA or mRNA encoding the polypeptide). Kits can also include 
instructions for use. 

For antibody-based kits, the kit can comprise, for example: (1) a first antibody 
(e.g., attached to a solid support) which binds to a polypeptide of the invention or an 
epitope of the hSARS virus; and, optionally, (2) a second, different antibody which binds 
to either the polypeptide or the first antibody and is conjugated to a detectable agent.

For oligonucleotide-based kits, the kit can comprise, for example: (1) an
oligonucleotide, e.g., a detectably labeled oligonucleotide, which hybridizes to a nucleic
acid sequence encoding a polypeptide of the invention or to a sequence within the hSARS
genome or (2) a pair of primers useful for amplifying a nucleic acid molecule containing
an hSARS sequence. The kit can also comprise, e.g., a buffering agent, a preservative, or
a protein stabilizing agent. The kit can also comprise components necessary for detecting
the detectable agent (e.g., an enzyme or a substrate). The kit can also contain a control
sample or a series of control samples which can be assayed and compared to the test
sample contained. Each component of the kit is usually enclosed within an individual
container and all of the various containers are within a single package along with
instructions for use.

5.7. Screening Assays 

The invention provides methods for the identification of a compound that inhibits 
the ability of hSARS virus to infect a host or a host cell. In certain embodiments, the
invention provides methods for the identification of a compound that reduces the ability
of hSARS virus to replicate in a host or a host cell. Any technique well-known to the 
skilled artisan can be used to screen for a compound that would abolish or reduce the 
ability of hSARS virus to infect a host and/or to replicate in a host or a host cell. 

In certain embodiments, the invention provides methods for the identification of a
compound that inhibits the ability of hSARS virus to replicate in a mammal or a
mammalian cell. More specifically, the invention provides methods for the identification
of a compound that inhibits the ability of hSARS virus to infect a mammal or a
mammalian cell. In certain embodiments, the invention provides methods for the
identification of a compound that inhibits the ability of hSARS virus to replicate in a
mammalian cell. In a specific embodiment, the mammalian cell is a human cell.

In another embodiment, a cell is contacted with a test compound and infected with
the hSARS virus. In certain embodiments, a control culture is infected with the hSARS
virus in the absence of a test compound. The cell can be contacted with a test compound
before, concurrently with, or subsequent to the infection with the hSARS virus. In a
specific embodiment, the cell is a mammalian cell. In an even more specific embodiment,
the cell is a human cell. In certain embodiments, the cell is incubated with the test
compound for at least 1 minute, 5 minutes, 15 minutes, 30 minutes, 1 hour, 2 hours, 5
hours, 12 hours, or 1 day. The titer of the virus can be measured at any time during the
assay. In certain embodiments, a time course of viral growth in the culture is determined.
If the viral growth is inhibited or reduced in the presence of the test compound, the test
compound is identified as being effective in inhibiting or reducing the growth or infection
of the hSARS virus. In a specific embodiment, the compound that inhibits or reduces the
growth of the hSARS virus is tested for its ability to inhibit or reduce the growth rate of
other viruses and/or to test its specificity for the hSARS virus.
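
The comparison between treated and control cultures described above can be expressed numerically. As a sketch only, and assuming titers are reported in one consistent unit for both cultures (the text does not specify a unit or an effect threshold), the reduction can be computed as a log10 difference and as a percentage. The function name and all values used with it are hypothetical.

```python
import math


def titer_reduction(control_titer, treated_titer):
    """Return (log10 reduction, percent reduction) of treated vs control titer.

    Both titers must be in the same unit. Positive results mean the treated
    culture has less virus than the untreated control.
    """
    if control_titer <= 0 or treated_titer <= 0:
        raise ValueError("titers must be positive")
    log_reduction = math.log10(control_titer / treated_titer)
    percent_reduction = (1.0 - treated_titer / control_titer) * 100.0
    return log_reduction, percent_reduction


if __name__ == "__main__":
    # Invented titers for illustration only.
    log_red, pct = titer_reduction(control_titer=1e6, treated_titer=1e4)
    print(f"log10 reduction: {log_red:.1f}, percent reduction: {pct:.1f}%")
```

Repeating the calculation at each sampled time point gives the time course of viral growth mentioned above.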

In one embodiment, a test compound is administered to a model animal and the
model animal is infected with the hSARS virus. In certain embodiments, a control model
animal is infected with the hSARS virus without the administration of a test compound.
The test compound can be administered before, concurrently with, or subsequent to the
infection with the hSARS virus. In a specific embodiment, the model animal is a
mammal. In an even more specific embodiment, the model animal can be, but is not
limited to, a cotton rat, a mouse, or a monkey. The titer of the virus in the model animal
can be measured at any time during the assay. In certain embodiments, a time course of
viral growth in the culture is determined. If the viral growth is inhibited or reduced in the
presence of the test compound, the test compound is identified as being effective in
inhibiting or reducing the growth or infection of the hSARS virus. In a specific
embodiment, the compound that inhibits or reduces the growth of the hSARS virus in the
model animal is tested for its ability to inhibit or reduce the growth rate of other viruses
to test its specificity for the hSARS virus.

6. EXAMPLES 

The following examples illustrate the isolation and identification of the novel 
hSARS virus. These examples should not be construed as limiting.

METHODS AND RESULTS 

As a general reference, Wiedbrauk DL & Johnston SLG (Manual of Clinical 
Virology, Raven Press, New York, 1993) was used. 

6.1. Clinical Subjects

The study included all 50 patients who fitted a modified World Health
Organization (WHO) definition of SARS and were admitted to 2 acute regional hospitals
in Hong Kong Special Administrative Region (HKSAR) between February 26 and March
26, 2003 (WHO. Severe acute respiratory syndrome (SARS) 2000, Weekly Epidemiol Rec,
78:81-83). A lung biopsy from an additional patient, who had typical SARS and was
admitted to a third hospital, was also included in the study. Briefly, the case definition
for SARS was: (i) fever of 38°C or more; (ii) cough or shortness of breath; (iii) new
pulmonary infiltrates on chest radiograph; and (iv) either a history of exposure to a
patient with SARS or absence of response to empirical antimicrobial coverage for typical
and atypical pneumonia (beta-lactams and macrolides, fluoroquinolones or tetracyclines).

Nasopharyngeal aspirates and serum samples were collected from all patients.
Paired acute and convalescent sera and feces were available from some patients. Lung
biopsy tissue from one patient was processed for a viral culture, RT-PCR, routine
histopathological examination, and electron microscopy. Nasopharyngeal aspirates, feces
and sera submitted for microbiological investigation of other diseases were included in
the study under blinding and served as controls.

The medical records were reviewed retrospectively by the attending physicians
and clinical microbiologists. Routine hematological, biochemical and microbiological
examinations, including bacterial culture of blood and sputum, serological study and
collection of nasopharyngeal aspirates for virological tests, were carried out.

6.2. Cell Line 

FRhK-4 (fetal rhesus monkey kidney) cells were maintained in minimal essential 
medium (MEM) with 1% fetal calf serum, 1% streptomycin and penicillin, 0.2% nystatin 
and 0.05% garamycin. 

6.3. Viral Infection

Two hundred µl of clinical samples (nasopharyngeal aspirates) from two patients
(see the Result section, infra) in virus transport medium were used to infect FRhK-4 cells.
The inoculated cells were incubated at 37°C for 1 hour. One ml of MEM containing 1 µg
trypsin was then added to the culture and the infected cells were incubated in a 37°C
incubator supplied with 5% carbon dioxide. Cytopathic effects were observed in the
infected cells after 2 to 4 days of incubation. The infected cells were passaged into new
FRhK-4 cells and cytopathic effects were observed within 1 day after the inoculation.
The infected cells were tested by an immunofluorescent assay for influenza A, influenza
B, respiratory syncytial virus, parainfluenza types 1, 2 and 3, adenovirus and human
metapneumovirus (hMPV) and negative results were obtained for all cases. The infected
cells were also tested by RT-PCR for influenza A and human metapneumovirus with
negative results.

6.4. Virus Morphology 

The infected cells prepared as described above were harvested and pelleted by
centrifugation, and the cell pellets were processed for thin-section transmission electron
microscopic visualization. Viral particles were identified in the cells infected with both
clinical specimens, but not in control cells which were not infected with the virus.
Virions isolated from the infected cells were about 70-100 nanometers (Figure 2). Viral
capsids were found predominantly within the vesicles of the Golgi and endoplasmic
reticulum and were not free in the cytoplasm. Virus particles were also found at the cell
membrane.

One virus isolate was ultracentrifuged and the cell pellet was negatively stained
using phosphotungstic acid. Virus particles characteristic of Coronaviridae were thus
visualized. Since the human Coronaviruses hitherto recognized are not known to cause a
similar disease, the present inventors postulated that the virus isolates represent a novel
virus that infects humans.

6.5. Antibody Response 

To further confirm that this novel virus is responsible for causing SARS in the
infected patients, blood serum samples from the patients who were suffering from SARS
were obtained and a neutralization test was performed. Typically diluted serum (x50,
x200, x800 and x1600) was incubated with acetone-fixed FRhK-4 cells infected with
hSARS virus at 37°C for 45 minutes. The incubated cells were then washed with
phosphate-buffered saline and stained with anti-human IgG-FITC conjugated antibody.
The cells were then washed and examined under a fluorescent microscope. In these
experiments, positive signals were found in 8 patients who had SARS (Figure 3),
indicating that these patients had an IgG antibody response to this novel human
respiratory virus of Coronaviridae. By contrast, no signal was detected in 4
negative-control paired sera. The serum titers of anti-hSARS antibodies of the tested patients are
shown in Table 1.







Table 1

Name          Date          Lab No.        Anti-SARS
Patient A     25-Feb-03     S2728          <50
              6-Mar-03      S2728          1600
Patient B     26-Feb-03     S2441          50
              3-Mar-03      S2441          200
Patient C     4-Mar-03      S3279          200
              14-Mar-03     S3279          1600
Patient D     6-Mar-03      M41045         <50
              11-Mar-03     MB943703       800
Patient E     4-Mar-03      M38953         <50
              18-Mar-03     KWH03/3601     800
Control F     13-Feb-03     M27124         <50
              1-Mar-03      MB942968       <50
Patient G     3-Mar-03      M38685         <50
              7-Mar-03      KWH03/2900     Equivocal

Blinded samples:
1a*           Acute                        <50
1b            Convalescent                 1600
2a*           Acute                        50
2b            Convalescent                 >1600
3a*           Acute                        50
3b            Convalescent                 >1600
4a*           Acute                        <50
4b            Convalescent                 <50
5a*           Acute                        <50
5b            Convalescent                 <50
6a*           Acute                        <50
6b            Convalescent                 <50

NB: * patients with SARS

These results indicated that this novel member of Coronaviridae is a key 
pathogen in SARS. 
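
The paired titers in Table 1 can be summarized as fold rises between the first and second sample. The sketch below transcribes the named rows of Table 1 and, as an assumption not stated in the text, treats "<50" as 50 so that each computed value is a lower bound on the true rise; "Equivocal" is left unscored. The four-fold threshold is a widely used serological convention and is likewise an assumption here, as are all function names.

```python
# First and second sample titers for the named rows of Table 1.
PAIRED_TITERS = {
    "Patient A": ("<50", "1600"),
    "Patient B": ("50", "200"),
    "Patient C": ("200", "1600"),
    "Patient D": ("<50", "800"),
    "Patient E": ("<50", "800"),
    "Control F": ("<50", "<50"),
    "Patient G": ("<50", "Equivocal"),
}


def parse_titer(text):
    """Convert a titer string to a number; '<50' becomes 50.0 (assumption)."""
    text = text.strip()
    if text.lower() == "equivocal":
        return None
    return float(text.lstrip("<>"))


def fold_rise(first, second):
    """Second titer divided by first, or None when either is unscored."""
    a, b = parse_titer(first), parse_titer(second)
    if a is None or b is None:
        return None
    return b / a


def seroconverted(first, second, threshold=4.0):
    """True when the rise meets the assumed four-fold threshold."""
    rise = fold_rise(first, second)
    return None if rise is None else rise >= threshold


if __name__ == "__main__":
    for name, (first, second) in PAIRED_TITERS.items():
        print(name, fold_rise(first, second), seroconverted(first, second))
```

Under these assumptions every scored patient pair shows at least a four-fold rise, while the control pair shows none.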

6.6. Sequences of the hSARS Virus 

Total RNA from infected or uninfected FRhK-4 cells was harvested two days
post-infection. One hundred ng of purified RNA was reverse transcribed using
Superscript® II reverse transcriptase (Invitrogen) in a 20 µl reaction mixture containing
10 pg of a degenerate primer (5'-GCCGGAGCTCTGCAGAATTCNNNNNNN-3'; SEQ
ID NO: 5; N = A, T, G or C) as recommended by the manufacturer. Reverse transcribed
products were then purified by a QIAquick® PCR purification kit as instructed by the
manufacturer and eluted in 30 µl of 10 mM Tris-HCl, pH 8.0. Three µl of purified
cDNA products were added to a 25 µl reaction mixture containing 2.5 µl of 10x PCR buffer,
4 µl of 25 mM MgCl2, 0.5 µl of 10 mM dNTP, 0.25 µl of AmpliTaq Gold® DNA
polymerase (Applied Biosystems), 2.5 µCi of [α-32P]CTP (Amersham), and 2 µl of 10 µM
primer (5'-GCCGGAGCTCTGCAGAATTC-3', SEQ ID NO: 6). Reactions were
thermal cycled through the following profile: 94°C for 8 min followed by 2 cycles of
94°C for 1 min, 40°C for 1 min, 72°C for 2 min. This temperature profile was followed
by 35 cycles of 94°C for 1 min, 60°C for 1 min, 72°C for 1 min. Six µl of the PCR products
were analyzed by 5% denaturing polyacrylamide gel electrophoresis. The gel was exposed
to X-ray film and the film was developed after an overnight exposure. Unique PCR
products which were only identified in infected cell samples were isolated from the gel
and eluted in 50 µl of 1x TE buffer. Eluted PCR products were then re-amplified in 25
µl of reaction mixture containing 2.5 µl of 10x PCR buffer, 4 µl of 25 mM MgCl2, 0.5 µl
of 10 mM dNTP, 0.25 µl of AmpliTaq Gold® DNA polymerase (Applied Biosystems), and 1
µl of 10 µM primer (5'-GCCGGAGCTCTGCAGAATTC-3', SEQ ID NO:6). Reaction
mixtures were thermal cycled through the following profile: 94°C for 8 min followed by
35 cycles of 94°C for 1 min, 60°C for 1 min, 72°C for 1 min. PCR products were
cloned using a TOPO TA Cloning® kit (Invitrogen) and ligated plasmids were
transformed into TOP10 E. coli competent cells (Invitrogen). PCR inserts were
sequenced by a BigDye® cycle sequencing kit as recommended by the manufacturer
(Applied Biosystems) and sequencing products were analyzed by an automatic sequencer
(Applied Biosystems, model number 3770). The obtained sequence (SEQ ID NO: 1) is
shown in Figure 1. The deduced amino acid sequence from the obtained DNA sequence
(SEQ ID NO: 2) showed 57% homology to the polymerase protein of identified
Coronaviruses.

Similarly, two other partial sequences (SEQ ID NOS: 11 and 13) and deduced
amino acid sequences (SEQ ID NOS: 12 and 14, respectively) were obtained from the
hSARS virus and are shown in Figures 8 (SEQ ID NOS: 11 and 12) and 9 (SEQ ID
NOS: 13 and 14).

The entire genomic sequence of hSARS virus is shown in Figure 10 (SEQ ID
NO: 15). The deduced amino acid sequences of SEQ ID NO: 15 in all three frames are
shown in Figure 11 (DNA sequences shown in SEQ ID NOS: 16, 240 and 737; for amino
acid sequences, see SEQ ID NOS: 17-239, 241-736 and 738-1107, respectively). The
deduced amino acid sequences of the complement of SEQ ID NO: 15 in all three frames
are shown in Figure 12 (DNA sequences shown in SEQ ID NOS: 1108, 1590 and 1965;
for amino acid sequences, see SEQ ID NOS: 1109-1589, 1591-1964 and 1966-2470,
respectively).

6.7. Detection of the hSARS Virus in Nasopharyngeal Aspirates 

First, the nasopharyngeal aspirates (NPA) were examined by rapid
immunofluorescent antigen detection for influenza A and B, parainfluenza types 1, 2 and
3, respiratory syncytial virus and adenovirus (Chan KH, Maldeis N, Pope W, Yup A,
Ozinskas A, Gill J, Seto WH, Shortridge KF, Peiris JSM. Evaluation of Directigen Flu
A+B test for rapid diagnosis of influenza A and B virus infections. J Clin Microbiol
2002; 40: 1675-1680) and were cultured for conventional respiratory pathogens on
Madin-Darby Canine Kidney, LLC-Mk2, RDE, Hep-2 and MRC-5 cells (Wiedbrauk DL,
Johnston SLG. Manual of clinical virology. Raven Press, New York. 1993).
Subsequently, fetal rhesus kidney (FRhK-4) and A-549 cells were added to the panel of
cell lines used. Reverse transcription polymerase chain reaction (RT-PCR) was
performed directly on the clinical specimen for influenza A (Fouchier RA, Bestebroer
TM, Herfst S, Van Der Kemp L, Rimmelzwaan GF, Osterhaus AD. Detection of influenza
A virus from different species by PCR amplification of conserved sequences in the
matrix gene. J Clin Microbiol. 2000; 38: 4096-101) and human metapneumovirus
(HMPV). The primers used for HMPV were: for the first round,
5'-AARGTSAATGCATCAGC-3' (SEQ ID NO:7) and
5'-CAKATTYTGCTTATGCTTTC-3' (SEQ ID NO:8); and, as nested primers,
5'-ACACCTGTTACAATACCAGC-3' (SEQ ID NO:9) and
5'-GACTTGAGTCCCAGCTCCA-3' (SEQ ID NO:10). The size of the nested PCR
product was 201 bp. An ELISA for mycoplasma was used to screen cell cultures (Roche
Diagnostics GmbH, Roche, Indianapolis, USA).
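
The first-round HMPV primers and the random-primed RT primer of Section 6.6 contain degenerate positions (R, S, K, Y, N). As an illustration only, the following Python sketch expands such a primer into the concrete sequences it represents, using the standard IUPAC ambiguity codes. The function name expand_degenerate and the dictionary names are hypothetical and are not part of the described method.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(primer: str) -> list[str]:
    """Return every concrete sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


# Primer sequences as given in the text (5' to 3').
HMPV_FIRST_ROUND = {
    "SEQ ID NO:7": "AARGTSAATGCATCAGC",
    "SEQ ID NO:8": "CAKATTYTGCTTATGCTTTC",
}
RANDOM_RT_PRIMER = "GCCGGAGCTCTGCAGAATTCNNNNNNN"  # SEQ ID NO:5

if __name__ == "__main__":
    for seq_id, primer in HMPV_FIRST_ROUND.items():
        variants = expand_degenerate(primer)
        print(seq_id, len(variants), variants)
    # Seven N positions give 4**7 possible 3' ends.
    print("SEQ ID NO:5 variants:", len(expand_degenerate(RANDOM_RT_PRIMER)))
```

Each first-round primer carries two two-fold degenerate positions, so each corresponds to a pool of four sequences, whereas the seven N positions of SEQ ID NO:5 correspond to 4^7 possible 3' ends.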

6.7.1. RT-PCR Assay 
Subsequent to culturing and genetic sequencing of the hSARS virus from two
patients (see Section 6.6, supra), an RT-PCR was developed to detect the hSARS virus
sequence from NPA samples. Total RNA from clinical samples was reverse transcribed
using random hexamers and cDNA was amplified using primers
5'-TACACACCTCAGCGTTG-3' (SEQ ID NO:3) and 5'-CACGAACGTGACGAAT-3'
(SEQ ID NO:4), which are constructed based on the hSARS viral genome, in the
presence of 2.5 mM MgCl2 (94°C for 8 min followed by 40 cycles of 94°C for 1 min,
50°C for 1 min, 72°C for 1 min).

The summary of a typical RT-PCR protocol is as follows:

RNA extraction

RNA from 140 µl of NPA samples is extracted by QIAquick® viral RNA
extraction kit and is eluted in 50 µl of elution buffer.

Reverse transcription

RNA                                        11.5 µl
0.1 M DTT                                  2 µl
5x buffer                                  4 µl
10 mM dNTP                                 1 µl
Superscript II, 200 U/µl (Invitrogen)      1 µl
Random hexamers, 0.3 µg/µl                 0.5 µl

Reaction condition: 42°C, 50 min
                    94°C, 3 min
                    4°C

PCR

cDNA generated by random primers is amplified in a 50 µl reaction as follows:

cDNA                                       2 µl
10 mM dNTP                                 0.5 µl
10x buffer                                 5 µl
25 mM MgCl2                                5 µl
25 µM Forward primer                       0.5 µl
25 µM Reverse primer                       0.5 µl
AmpliTaq Gold® polymerase, 5 U/µl
(Applied Biosystems)                       0.25 µl
Water                                      36.25 µl

Thermal-cycle condition: 95°C, 10 min, followed by 40 cycles of 95°C, 1 min;
50°C, 1 min; 72°C, 1 min.

Primer Sequences

Primers were designed based on the RNA-dependent RNA polymerase encoding
sequence (SEQ ID NO:1) of the hSARS virus.

Forward primer: 5' TACACACCTCAGCGTTG 3' (SEQ ID NO:3)
Reverse primer: 5' CACGAACGTGACGAAT 3' (SEQ ID NO:4)

Product (amplicon) size: 182 bp
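
The component volumes listed above can be checked arithmetically. The following Python sketch, offered only as an illustration, sums the listed volumes and derives final concentrations from the stock concentrations by simple dilution (C_final = C_stock x V_added / V_total). The dictionary and function names are hypothetical; the volumes and stock concentrations are those listed in the protocol.

```python
# Volumes in microlitres, as listed in the protocol above.
RT_MIX_UL = {
    "RNA": 11.5,
    "0.1 M DTT": 2.0,
    "5x buffer": 4.0,
    "10 mM dNTP": 1.0,
    "Superscript II, 200 U/ul": 1.0,
    "random hexamers, 0.3 ug/ul": 0.5,
}
PCR_MIX_UL = {
    "cDNA": 2.0,
    "10 mM dNTP": 0.5,
    "10x buffer": 5.0,
    "25 mM MgCl2": 5.0,
    "25 uM forward primer": 0.5,
    "25 uM reverse primer": 0.5,
    "AmpliTaq Gold, 5 U/ul": 0.25,
    "water": 36.25,
}


def final_concentration(stock: float, volume_ul: float, total_ul: float) -> float:
    """Dilution formula: stock concentration scaled by the volume fraction."""
    return stock * volume_ul / total_ul


rt_total = sum(RT_MIX_UL.values())
pcr_total = sum(PCR_MIX_UL.values())
mgcl2_mM = final_concentration(25.0, PCR_MIX_UL["25 mM MgCl2"], pcr_total)
primer_uM = final_concentration(25.0, PCR_MIX_UL["25 uM forward primer"], pcr_total)
dntp_mM = final_concentration(10.0, PCR_MIX_UL["10 mM dNTP"], pcr_total)

if __name__ == "__main__":
    print(f"RT reaction volume:  {rt_total} ul")
    print(f"PCR reaction volume: {pcr_total} ul")
    print(f"MgCl2: {mgcl2_mM} mM, each primer: {primer_uM} uM, dNTP: {dntp_mM} mM")
```

Under these figures the reverse transcription mix totals 20 µl and the PCR mix totals 50 µl, and the final MgCl2 concentration works out to 2.5 mM, in agreement with the concentration stated for the RT-PCR assay.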
Real-Time Quantitative PCR Assay 

Total RNA from 140 µl of nasopharyngeal aspirate (NPA) was extracted by
QIAamp® virus RNA mini kit (Qiagen) as instructed by the manufacturer. Ten µl of
eluted RNA samples were reverse transcribed by 200 U of Superscript® II reverse
transcriptase (Invitrogen) in a 20 µl reaction mixture containing 0.15 µg of random
hexamers, 10 mmol/L DTT, and 0.5 mmol/L dNTP, as instructed. Complementary DNA
was then amplified in a SYBR Green I fluorescence reaction (Roche) mixture. Briefly,
20 µl reaction mixtures containing 2 µl of cDNA, 3.5 mmol/L MgCl2, 0.25 µmol/L of
forward primer (5'-TACACACCTCAGCGTTG-3'; SEQ ID NO:3) and 0.25 µmol/L
reverse primer (5'-CACGAACGTGACGAAT-3'; SEQ ID NO:4) were thermal-cycled
by a LightCycler® (Roche) with the PCR program (95°C, 10 min followed by 50 cycles
of 95°C for 10 min; 57°C for 5 secs; and 72°C for 9 secs). Plasmids containing the target
sequence were used as positive controls. Fluorescence signals from these reactions were
captured at the end of the extension step in each cycle. To determine the specificity of the
assay, PCR products (184 base pairs) were subjected to a melting curve analysis at the
end of the assay (65°C to 95°C, 0.1°C per second).
CLINICAL RESULTS
Clinical findings:

All 50 patients with SARS were ethnic Chinese. They represented 5 different 
epidemiologically linked clusters as well as additional sporadic cases fitting the case
definition. They were hospitalized at a mean of 5 days after the onset of symptoms. The
median age was 42 years (range of 23 to 74) and the female to male ratio was 1.3. 
Fourteen (28%) were health care workers and five (10%) had a history of visit to a 
hospital experiencing a major outbreak of SARS. Thirteen (26%) patients had household 
contacts and 12 (24%) others had social contacts with patients with SARS. Four (8%) 
had a history of recent travel to mainland China. 

The major complaints from most patients were fever (90%) and shortness of 
breath. Cough and myalgia were present in more than half the patients (Table 2). Upper 
respiratory tract symptoms such as rhinorrhea (24%) and sore throat (20%) were present 
in a minority of patients. Diarrhea (10%) and anorexia (10%) were also reported. At 
initial examination, auscultatory findings, such as crepitations and decreased air entry, 
were present in only 38% of patients. Dry cough was reported by 62% of patients. All 
patients had radiological evidence of consolidation, at the time of admission, involving 1 
zone (in 36), 2 zones (13) and 3 zones (1). 



Table 2

Clinical symptoms        Number (percentage)
Fever                    50 (100%)
Chill or rigors          37 (74%)
Cough                    31 (62%)
Myalgia                  27 (54%)
Malaise                  25 (50%)
Running nose             12 (24%)
Sore throat              10 (20%)
Shortness of breath      10 (20%)
Anorexia                 10 (20%)
Diarrhea                 5 (10%)
Headache                 10 (20%)
Dizziness                6 (12%)

* Truncal maculopapular rash was noted in 1 patient.
In spite of the high fever, most patients (98%) had no evidence of leukocytosis.
Lymphopenia (68%), leucopenia (26%), thrombocytopenia (40%) and anemia (18%)
were present in peripheral blood examination (Table 3). The levels of the parenchymal liver
enzyme alanine aminotransferase (ALT) and the muscle enzyme creatine kinase (CPK)
were elevated in 34% and 26% of patients, respectively.



Table 3

Laboratory parameter                 Mean (range)        Percentage of abnormal   Normal range
Haemoglobin                          12.9 (8.9-15.9)                              11.5-16.5 g/dl
  Anaemia                                                9 (18%)
White cell count                     5.17 (1.1-11.4)                              4-11 x 10^9/L
  Leucopenia                                             13 (26%)
Lymphocyte count                     0.78 (0.3-1.5)                               1.5-4.0 x 10^9/L
  Significant lymphopenia                                34 (68%)
  (<1.0 x 10^9/L)
Platelet count                       174 (88-351)                                 150-400 x 10^9/L
  Thrombocytopenia                                       20 (40%)
Alanine aminotransferase (ALT)       63 (11-350)                                  6-53 U/L
  Elevated ALT                                           17 (34%)
Albumin                              37 (26-50)                                   42-54 g/L
  Low albumin                                            34 (68%)
Globulin                             33 (21-42)                                   24-36 g/L
  Elevated globulin                                      10 (20%)
Creatine kinase                      244 (31-1379)                                34-138 U/L
  Elevated creatine kinase                               13 (26%)


Routine microbiological investigations for known viruses and bacteria by culture,
antigen detection, and PCR were negative in most cases. Blood culture was positive for
Escherichia coli in a 74-year-old male patient, who was admitted to the intensive care unit,
and was attributed to hospital-acquired urinary tract infection. Klebsiella pneumoniae
and Haemophilus influenzae were isolated from the sputum specimens of 2 other patients
on admission.

Oral levofloxacin 500 mg q24h was given to 9 patients, and intravenous (1.2 g
q8h)/oral (375 mg tid) amoxicillin-clavulanate and intravenous/oral clarithromycin 500
mg q12h were given to another 40 patients. Four patients were given oral oseltamivir 75
mg bid. In one patient, intravenous ceftriaxone 2 g q24h, oral azithromycin 500 mg
q24h, and oral amantadine 100 mg bid were given for empirical coverage of typical and
atypical pneumonia.

Nineteen patients progressed to severe disease with oxygen desaturation and
required intensive care and ventilatory support. The mean number of days of
deterioration from the onset of symptoms was 8.3 days. Intravenous ribavirin 8 mg/kg
q8h and steroid were given to 49 patients at a mean of 6.7 days after onset of symptoms.

The risk factors associated with severe complicated disease requiring intensive
care and ventilatory support were older age, lymphopenia, impaired ALT, and delayed
initiation of ribavirin and steroid (Table 4). All the complicated cases were treated with
ribavirin and steroid after admission to the intensive care unit, whereas all the
uncomplicated cases were started on ribavirin and steroid in the general ward. As
expected, 31 uncomplicated cases recovered or improved, whereas 8 complicated cases
deteriorated, with one death at the time of writing. All 50 patients were monitored for a
mean of 12 days at the time of writing.
Table 4

                                            Complicated     Uncomplicated    P value*
                                            cases (n=19)    cases (n=31)
Mean (SD) age (range)                       [illegible]     [illegible]      [illegible]
Male / Female ratio                         [illegible]     [illegible]      [illegible]
Underlying illness                          5 †             1 ‡              [illegible]
Travel to China                             [illegible]     [illegible]      [illegible]
Hospital visit                              [illegible]     [illegible]      [illegible]
Social contact                              [illegible]     [illegible]      [illegible]
[illegible]                                 1.4             1.2              N.S.
Mean (SD) day of deterioration from the     8.3 ± 2.6       Not applicable
  onset of symptoms §
Mean (SD) day of initiation of ribavirin    7.7 ± 2.9       5.7 ± 2.6        P < 0.05
  & steroid from the onset of symptoms
Initiation of ribavirin & steroid after     12              0                P < 0.001
  deterioration
Response to ribavirin & steroid             11              28               P < 0.05
Outcome
  Improved or recovered                     10              31               P < 0.01
  Not improving ||                          8               0                P < 0.01

* Multi-variant analysis was not performed due to the low number of cases;
† 2 patients had diabetes mellitus, 1 had hypertrophic obstructive cardiomyopathy, 1
had chronic active hepatitis B, and 1 had brain tumour;
‡ 1 patient had essential hypertension;
§ desaturation requiring intensive care support;
|| 1 died.
Two virus isolates, subsequently identified as members of Coronaviridae (see
below), were isolated from two patients. One was from an open lung biopsy tissue of a
53-year-old Hong Kong Chinese resident and the other from a nasopharyngeal aspirate of
a 42-year-old female with good previous health. The 53-year-old male had a history of
10-hour household contact with a Chinese visitor who came from Guangzhou and later
died from SARS. Two days after this exposure, he presented with fever, malaise,
myalgia, and headache. Crepitations were present over the right lower zone and there
was a corresponding alveolar shadow on the chest radiograph. Hematological
investigation revealed lymphopenia of 0.7 x 10^9/L with normal total white cell and
platelet counts. Both ALT (41 U/L) and CPK (405 U/L) were impaired. Despite a
combination of oral azithromycin, amantadine, and intravenous ceftriaxone, there were
increasing bilateral pulmonary infiltrates and progressive oxygen desaturation. Therefore,
an open lung biopsy was performed 9 days after admission. Histopathological
examination showed a mild interstitial inflammation with scattered alveolar pneumocytes
showing cytomegaly, granular amphophilic cytoplasm and enlarged nuclei with
prominent nucleoli. No cells showed inclusions typical of herpesvirus or adenovirus
infection. The patient required ventilation and intensive care after the operative
procedure. Empirical intravenous ribavirin and hydrocortisone were given. He
succumbed 20 days after admission. In retrospect, coronavirus-like RNA was detected in
his nasopharyngeal aspirate, lung biopsy and post-mortem lung. He had a significant rise
in titer of antibodies against his own hSARS isolate, from 1/200 to 1/1600.

The second patient from whom an hSARS virus was isolated was a 42-year-old
female with good past health. She had a history of traveling to Guangzhou in mainland
China for 2 days. She presented with fever and diarrhea 5 days after her return to Hong
Kong. Physical examination showed crepitation over the right lower zone, which had a
corresponding alveolar shadow on the chest radiograph. Investigation revealed
leucopenia (2.7 x 10^9/L), lymphopenia (0.6 x 10^9/L), and thrombocytopenia (104 x 10^9/L).
Despite the empirical antimicrobial coverage with amoxicillin-clavulanate,
clarithromycin, and oseltamivir, she deteriorated 5 days after admission and required
mechanical ventilation and intensive care for 5 days. She gradually improved without
receiving treatment with ribavirin or steroid. Her nasopharyngeal aspirate was positive
for the virus in the RT-PCR and she seroconverted from an antibody titer of <1/50 to
1/1600 against the hSARS isolate.
Virological findings:

Viruses were isolated on FRhK-4 cells from the lung biopsy and nasopharyngeal
aspirate, respectively, of the two patients described above. The initial cytopathic effect
appeared between 2 and 4 days after inoculation, but on subsequent passage, cytopathic
effect appeared in 24 hours. Neither virus isolate reacted with the routine panel of
reagents used to identify virus isolates, including those for influenza A, B, parainfluenza
types 1, 2, and 3, adenovirus and respiratory syncytial virus (DAKO, Glostrup, Denmark).
They also failed to react in RT-PCR assays for influenza A and HMPV or in PCR assays
for mycoplasma. The virus was ether sensitive, indicating that it was an enveloped virus.
Electron microscopy of negatively stained (2% potassium phospho-tungstate, pH 7.0) cell
culture extracts obtained by ultracentrifugation showed the presence of pleomorphic
enveloped viral particles, of about 80-90 nm (ranging 70-130 nm) in diameter, whose
surface morphology appeared comparable to members of Coronaviridae (Figure 5A).
Thin section electron microscopy of infected cells revealed virus particles of 55-90 nm
diameter within the smooth-walled vesicles in the cytoplasm (Figures 5A and 5B). Virus
particles were also seen at the cell surface. The overall findings were compatible with
infections in the cells caused by viruses of Coronaviridae.

A thin section electron micrograph of the lung biopsy of the 53-year-old male
contained 60-90 nm viral particles in the cytoplasm of desquamated cells. These viral
particles were similar in size and morphology to those observed in the cell-cultured virus
isolate from both patients (Figure 4).

The RT-PCR products generated in a random primer RT-PCR assay were
analyzed and unique bands found in the virus-infected specimen were cloned and
sequenced. Of 30 clones examined, a clone containing 646 base pairs (SEQ ID NO: 1) of
unknown origin was identified. Sequence analysis of this DNA fragment suggested this
sequence had a weak homology to viruses of the family of Coronaviridae (data not
shown). The deduced amino acid sequence (215 amino acids, SEQ ID NO: 2) from this
unknown sequence, however, had the highest homology (57%) to the RNA polymerase of
bovine coronavirus and murine hepatitis virus, confirming that this virus belongs to the
family of Coronaviridae. Phylogenetic analysis of the protein sequences showed that this
virus, though most closely related to the group II coronaviruses, was a distinct virus
(Figures 5A and 5B).

Based on the 646 bp sequence of the isolate, specific primers for detecting the
new virus were designed for RT-PCR detection of this hSARS virus genome in clinical
specimens. Of the 44 nasopharyngeal specimens available from the 50 SARS patients,
22 had evidence of hSARS RNA. Viral RNA was detectable in 10 of 18 fecal samples
tested. The specificity of the RT-PCR reaction was confirmed by sequencing selected
positive RT-PCR amplified products. None of the 40 nasopharyngeal and fecal
specimens from patients with unrelated diseases were reactive in the RT-PCR assay.

To determine the dynamic range of real-time quantitative PCR, serial dilutions of
plasmid DNA containing the target sequence were made and subjected to the real-time
quantitative PCR assay. As shown in Figure 7A, the assay was able to detect as few as
10 copies of the target sequence. By contrast, no signal was observed in the water control
(Figure 7A). Positive signals were observed in 23 out of 29 serologically confirmed
SARS patients. In all of these positive cases, a unique PCR product (Tm = 82°C)
corresponding to the signal from the positive control was observed (Figure 7B, and data
not shown). These results indicated that this assay is highly specific to the target. The copy
numbers of the target sequence in these reactions ranged from 4539 to less than 10. Thus,
as many as 6.48 x 10^5 copies of this viral sequence could be found in 1 ml of NPA sample.

In 5 of the above positive cases, it was possible to collect NPA samples before
seroconversion. Viral RNA was detected in 3 of these samples, indicating that this assay
can detect the virus even at the early onset of infection.

To further validate the specificity of this assay, NPA samples from healthy
individuals (n=11) and patients who suffered from adenovirus (n=11), respiratory
syncytial virus (n=11), human metapneumovirus (n=11), influenza A virus (n=13) or
influenza B virus (n=1) infection were recruited as negative controls. All of these
samples, except one, were negative in the assay. The false positive case was negative in
a subsequent test. Taken together, including the initial false positive case, the real-time
quantitative PCR assay has a sensitivity of 79% and a specificity of 98%.
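
The reported sensitivity and specificity can be reproduced from the counts given above. The following Python sketch is illustrative only: the function and variable names are hypothetical, and the negative-control group sizes are those read from the text (in particular, the influenza B group is taken to be n=1, an assumption about a partly garbled figure that is consistent with the stated 98%).

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of confirmed cases that the assay detects."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of negative controls that the assay reports as negative."""
    return true_neg / (true_neg + false_pos)


# 23 of 29 serologically confirmed SARS patients were positive.
CONFIRMED_CASES = 29
POSITIVE_CASES = 23

# Negative-control groups as read from the text (influenza B n=1 is assumed).
NEGATIVE_CONTROLS = {
    "healthy": 11,
    "adenovirus": 11,
    "respiratory syncytial virus": 11,
    "human metapneumovirus": 11,
    "influenza A": 13,
    "influenza B": 1,
}
FALSE_POSITIVES = 1  # the initial false positive is counted

total_controls = sum(NEGATIVE_CONTROLS.values())
sens = sensitivity(POSITIVE_CASES, CONFIRMED_CASES - POSITIVE_CASES)
spec = specificity(total_controls - FALSE_POSITIVES, FALSE_POSITIVES)

if __name__ == "__main__":
    print(f"controls: {total_controls}")
    print(f"sensitivity: {sens:.1%}, specificity: {spec:.1%}")
```

With these counts, 23/29 and 57/58 round to the 79% and 98% stated in the text.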

Epidemiological data suggest that droplet transmission is one of the major routes
of transmission of this virus. The detection of live virus and the detection of high copy numbers
of viral sequence in NPA samples in the current study clearly support the view that cough and
sneeze droplets from SARS patients might be the major source of this infectious agent.
Interestingly, 2 out of 4 available stool samples from the SARS patients in this study
were positive in the assay (data not shown). The detection of the virus in feces suggests
that there might be other routes of transmission. It is relevant to note that a number of
animal coronaviruses are spread via the fecal-oral route (McIntosh K., 1974,
Coronaviruses: a comparative review. Current Top Microbiol Immunol 63: 85-112).
However, further studies are required to test whether the virus in feces is infectious or not.

Currently, apart from this hSARS virus, there are two known serogroups of
human coronaviruses (229E and OC43) (Hruskova J. et al., 1990, Antibodies to human
coronaviruses 229E and OC43 in the population of C.R., Acta Virol 34:346-52). The
primer sets used in the present assay do not have homology to the strain 229E. Due to
the lack of an available corresponding OC43 sequence in GenBank, it is not known
whether these primers would cross-react with this strain. However, sequence analyses of
available sequences in other regions of the OC43 polymerase gene indicate that the novel
human virus associated with SARS is genetically distinct from OC43. Furthermore, the
primers used in this study do not have homology to any of the sequences from known
coronaviruses. Thus, it is very unlikely that these primers would cross-react with the
strain OC43.

Apart from the novel pathogen, metapneumovirus was reported to be identified in
some SARS patients (Centers for Disease Control and Prevention, 2003, Morbidity and
Mortality Weekly Report 52: 269-272). No evidence of metapneumovirus infection was
detected in any of the patients in this study (data not shown), suggesting that the novel
hSARS virus of the invention is the key player in the pathogenesis of SARS.

Immunofluorescent antibody detection:

Thirty-five of the 50 most recent serum samples from patients with SARS had
evidence of antibodies to the hSARS virus (see Fig. 3). Of 27 patients from whom paired
acute and convalescent sera were available, all seroconverted or had a >4-fold
increase in antibody titer to the virus. Five other pairs of sera from additional SARS
patients from clusters outside this study group were also tested to provide a wider
sampling of SARS patients in the community, and all of them seroconverted. None
of 80 sera from patients with respiratory or other diseases and none of 200 normal
blood donors had detectable antibody.

When either seropositivity to HP-CV in a single serum or viral RNA detection in
the NPA or stool is considered evidence of infection with the hSARS virus, 45 of the 50
patients had evidence of infection. Of the 5 patients without any virological evidence of
Coronaviridae viral infection, only one had serum tested > 14 days
after onset of clinical disease.

6.8. A Quantitative TaqMan® Assay For hSARS Virus Detection 

6.8.1. Materials and Methods 
Patients and sample collection

Stored clinical specimens from 50 patients fulfilling the clinical WHO case
definition of SARS (http://www.who.int/csr/sars/casedefinition/en/) in whom the
diagnosis was subsequently confirmed by seroconversion were used in this study. NPA
samples were collected from days 1-3 of disease onset as described previously (Poon et
al., 2003, Clin. Chem. 49:953-955). NPA samples from patients with unrelated diseases
were recruited as controls.

RNA extraction and reverse transcription

RNA from clinical samples was extracted using the QIAamp® virus RNA mini kit
(Qiagen) as instructed by the manufacturer. In the previous conventional RT-PCR assay,
140 µl of NPA was used for RNA extraction. In the revised RNA extraction protocol,
540 µl of NPA was used for RNA extraction. Extracted RNA was finally eluted in 30 µl
of RNase-free water and stored at -20°C. Total RNA from clinical samples was then
reverse transcribed using random hexamers.

Conventional PCR for SARS-CoV

Conventional PCR assay was performed as described in Section 6.7.1.

Real-time quantitative PCR assays for SARS-CoV

A real-time quantitative PCR specific for the 1b region of the SARS-CoV was
used in this study. Complementary DNA was amplified by a TaqMan® PCR Core
Reagent kit in a 7000 Sequence Detection System (Applied Biosystems). Briefly, 4 µl of
cDNA was amplified in a 25 µl reaction containing 0.625 U AmpliTaq Gold® polymerase
(Applied Biosystems), 2.5 µl of 10x TaqMan® buffer A, 0.2 mM of dNTPs, 5.5 mM of
MgCl2, 2.5 U of AmpErase® UNG, and 1x primers-probe mixture (Assays by Design,
Applied Biosystems). The primer sequences were
5'-CAGAACGCTGTAGCTTCAAAAATCT-3' (SEQ ID NO:2471) and
5'-TCAGAACCCTGTGATGAATCAACAG-3' (SEQ ID NO:2472) and the probe was
5'-(FAM)TCTGCGTAGGCAATCC(NFQ)-3' (SEQ ID NO:2473; FAM,
6-carboxyfluorescein; NFQ, nonfluorescent quencher). Reactions were first incubated at
50°C for 2 min, followed by 95°C for 10 min. Reactions were then thermal-cycled for 45
cycles (95°C for 15 sec, 60°C for 1 min). Plasmids containing the target sequences were
used as positive controls.

6.8.2. Results 

A total of 50 NPA specimens isolated from serologically confirmed SARS
patients and collected during the first 3 days of illness were studied. Of these, 11 (22%) were
positive in our previously reported conventional RT-PCR assay (see Section 6.7.1)
(Table 5).



Table 5

                                  Number of positives
Day of    Sample    Conventional    Conventional RT-PCR      Real-time RT-PCR
onset     size      RT-PCR assay    assay with a modified    assay with a modified
                                    RNA extraction           RNA extraction
                                    protocol*                protocol**
1         8         0 (0%)          2 (25%)                  5 (63%)
2         16        3 (19%)         8 (50%)                  14 (88%)
3         26        8 (31%)         12 (46%)                 21 (81%)

* The overall detection rate of the assay is statistically different from that of the
conventional RT-PCR assay (McNemar's test, P<0.001)

** The overall detection rate of the assay is statistically different from that of the
conventional RT-PCR assay with a modified RNA extraction protocol (McNemar's test,
P<0.0001)
We reasoned that the poor sensitivity of SARS-CoV RT-PCR detection in the
early stage of the illness could be enhanced by increasing the initial extraction volume of
the NPA sample from 140 to 540 µl. Using this modified RNA extraction protocol, the
sensitivity of the conventional RT-PCR assay doubled from 11/50 to 22/50 (Table 5). The
overall detection rate of the modified RT-PCR protocol was statistically different from
that of our first generation RT-PCR protocol (McNemar's test, P<0.001, Table 5). Of 30
negative control samples, one false positive result was observed. With the RNA
extraction modification, the sensitivity and specificity of the conventional RT-PCR on
specimens collected during the first 3 days of illness were 44.0% and 96.6%, respectively.

To further improve the detection of SARS-CoV in samples from early onset, we
adopted a highly sensitive real-time quantitative assay for SARS-CoV detection (Fig. 14).
With the modified RNA extraction protocol, 40 out of 50 NPA samples were positive in
the real-time assay (Fig. 15 and Table 5). The overall detection rate of the real-time
assay was statistically different from those of the other two assays (McNemar's test,
P<0.0001, Table 5). In particular, 63% of the NPA samples isolated on day 1 of disease
onset were positive in the real-time quantitative RT-PCR assay. By contrast, none of the
specimens isolated on day 1 were positive in the conventional RT-PCR assay. For
samples isolated on days 2-3, more than 81% of these samples were positive in the
quantitative assay (Table 5). With the modified RNA extraction protocol and real-time
PCR technology, the sensitivity and specificity of the quantitative assay towards early
SARS samples were 80% and 100%, respectively.
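
McNemar's test compares paired results on the same specimens and depends only on the discordant pairs. The paired breakdown is not reported here, so the following Python sketch is a hypothetical illustration, not a reconstruction of the analysis actually performed. It computes the exact (binomial) two-sided McNemar P value under the loudly stated assumption that every specimen positive in the less sensitive assay was also positive in the more sensitive one, so that all discordant pairs fall in one direction. The function name mcnemar_exact is hypothetical.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Exact two-sided McNemar P value from the two discordant-pair counts.

    b: pairs positive only in assay 1; c: pairs positive only in assay 2.
    Under the null hypothesis the smaller count follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Totals from Table 5 (positives out of 50 specimens).
CONVENTIONAL = 11
CONVENTIONAL_MODIFIED = 22
REAL_TIME_MODIFIED = 40

# ASSUMPTION (not stated in the text): positives are nested, so the number of
# discordant pairs equals the difference in positives and none go the other way.
p_extraction = mcnemar_exact(CONVENTIONAL_MODIFIED - CONVENTIONAL, 0)
p_real_time = mcnemar_exact(REAL_TIME_MODIFIED - CONVENTIONAL_MODIFIED, 0)

if __name__ == "__main__":
    print(f"modified extraction vs conventional: P = {p_extraction:.2e}")
    print(f"real-time vs modified conventional:  P = {p_real_time:.2e}")
```

Under this nesting assumption the two comparisons give P values below 0.001 and 0.0001, respectively, which is in line with the thresholds quoted above; a different distribution of discordant pairs would give different values.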

The real-time assay also allowed one to quantitate the viral loads of these clinical 
specimens (1 copy/reaction = 27.8 copies/ml of an NPA sample). As shown in Fig. 16, the
progression of the disease resulted in an increase of viral loads in NPA (open bars). In 
addition, we further examined the viral loads of clinical samples that were negative (N= 
39) in our first generation RT-PCR assay (Fig. 16, grey bars). As expected, the viral 
loads of these samples (grey bars) were much lower than the overall viral loads of the 
whole cohort (open bars). 
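
The conversion factor of 27.8 copies/ml per copy/reaction can be illustrated by tracking how much of the original NPA sample each PCR reaction represents. In the following Python sketch, the 540 µl extraction volume, the 30 µl elution volume and the 4 µl of cDNA per reaction are taken from Section 6.8.1. The reverse transcription input of 10 µl of RNA in a 20 µl reaction is an assumption borrowed from the protocol of Section 6.7.1, because it is not restated for this assay. All names are hypothetical.

```python
def npa_equivalent_ul(
    npa_extracted_ul: float,
    elution_ul: float,
    rna_into_rt_ul: float,
    rt_reaction_ul: float,
    cdna_into_pcr_ul: float,
) -> float:
    """Volume of original NPA sample represented in one PCR reaction."""
    fraction_of_eluate = rna_into_rt_ul / elution_ul
    fraction_of_cdna = cdna_into_pcr_ul / rt_reaction_ul
    return npa_extracted_ul * fraction_of_eluate * fraction_of_cdna


def copies_per_ml(copies_per_reaction: float, equivalent_ul: float) -> float:
    """Scale copies per reaction to copies per ml of NPA."""
    return copies_per_reaction * 1000.0 / equivalent_ul


equivalent = npa_equivalent_ul(
    npa_extracted_ul=540.0,   # revised extraction protocol
    elution_ul=30.0,          # RNase-free water
    rna_into_rt_ul=10.0,      # ASSUMED, from the Section 6.7.1 protocol
    rt_reaction_ul=20.0,      # ASSUMED, from the Section 6.7.1 protocol
    cdna_into_pcr_ul=4.0,     # TaqMan reaction input
)
factor = copies_per_ml(1.0, equivalent)

if __name__ == "__main__":
    print(f"NPA equivalent per reaction: {equivalent:.1f} ul")
    print(f"1 copy/reaction = {factor:.1f} copies/ml")
```

Under these assumptions each reaction represents about 36 µl of NPA, so one copy per reaction corresponds to roughly 27.8 copies per ml, which agrees with the factor stated above.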

6.8.3. Discussion

Our objective in this study was to establish a highly sensitive RT-PCR assay for 
detecting SARS-CoV. In particular, we focused on detecting SARS-CoV RNA in 
samples isolated on days 1-3 of disease onset. Using our first generation conventional 
RT-PCR assay, only 22% of these samples were shown to have SARS-CoV RNA. In 
order to establish a more sensitive assay, we modified the RNA extraction method and 
adopted real-time quantitative technology in our current study. By increasing the initial 
volume for RNA extraction from 140 µl to 540 µl, the proportion of positive cases was 
increased to 44%. In addition, by further applying the real-time quantitative PCR 
technology in the revised assay, 80% of early SARS samples became positive. More 
importantly, the use of a 5' nuclease probe in the real-time quantitative assay can 
minimize the false positive rate by increasing signal specificity. Taken together, results 
from this study suggest that our revised RT-PCR assay allows the early and accurate 
diagnosis of SARS. 

The quantitative result of our modified RT-PCR assay provided further 
information regarding the viral load of SARS-CoV in these clinical specimens. Our 
results indicated that the viral load increases as the disease progresses. Of those samples 
that were negative in the first generation RT-PCR assay, all contained very low amounts 
of viral RNA (Figs. 15 and 16). This observation explained why most of these samples 
were negative using our first generation RT-PCR assay. Interestingly, for those 
specimens that were positive in the first generation assay, some had very high amounts of 
viral RNA (Fig. 16). 
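
Real-time quantitative PCR commonly derives copy numbers from a standard curve that relates the threshold cycle (Ct) to the logarithm of the input copy number. The sketch below illustrates that general technique under stated assumptions: the dilution series, Ct values, and resulting slope and intercept are all hypothetical, and the calibration actually used in this study is not restated in this passage.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate copies per reaction from a Ct."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical ten-fold dilution series of a standard (copies per reaction)
# and made-up Ct values; not data from this study.
standards = [1e1, 1e2, 1e3, 1e4, 1e5]
cts = [36.6, 33.3, 30.0, 26.7, 23.4]

slope, intercept = fit_standard_curve(standards, cts)
estimate = copies_from_ct(31.5, slope, intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, estimate={estimate:.0f} copies/reaction")
```

A lower Ct corresponds to a higher starting copy number, which is consistent with the observation that samples missed by the first generation assay carried low amounts of viral RNA.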

In summary, by increasing the initial sample volume for RNA extraction and 
utilizing real-time quantitative PCR technology, we established a sensitive and accurate 
RT-PCR assay for the prompt identification of SARS-CoV. It is expected that, with this 
rapid diagnostic method, a prompt identification of this pathogen will facilitate the 
control of the disease and the institution of prompt treatment. 

6.9. Clinical observations and Discussion 

The outbreak of SARS is unusual in a number of aspects, in particular, in the 
appearance of clusters of patients with pneumonia among health care workers and family 
contacts. In this series of patients with SARS, investigations for conventional pathogens 
of atypical pneumonia proved negative. However, a virus that belongs to the family 
Coronaviridae was isolated from the lung biopsy and nasopharyngeal aspirate obtained 
from two SARS patients, respectively. Phylogenetically, the virus was not closely related 
to any known human or animal coronavirus or torovirus. The present analysis is based on 
a 646 bp fragment (SEQ ID NO: 1) of the polymerase gene, which indicates that the virus 
relates to antigenic group 2 of the coronaviruses along with murine hepatitis virus and 
bovine coronavirus. However, viruses of the Coronaviridae can undergo heterologous 
recombination within the virus family, and genetic analysis of other parts of the genome 
needs to be carried out before the nature of this new virus is more conclusively defined 
(Holmes KV. Coronaviruses. Eds Knipe DM, Howley PM Fields Virology, 4th Edition, 
Lippincott Williams & Wilkins, Philadelphia, pp. 1187-1203). The biological, genetic 
and clinical data, taken together, indicate that the new virus is not one of the two known 
human coronaviruses. 

The majority (90%) of patients with clinically defined SARS had either 
serological or RT-PCR evidence of infection by this virus. In contrast, neither antibody 
nor viral RNA was detectable in healthy controls. All 27 patients from whom acute and 
convalescent sera were available demonstrated rising antibody titers to hSARS virus, 
strengthening the contention that a recent infection with this virus is a necessary factor in 
the evolution of SARS. In addition, all five pairs of acute and convalescent sera tested 
from patients from other hospitals in Hong Kong also showed seroconversion to the virus. 
The five patients who have not shown serological or virological evidence of hSARS virus 
infection need to have later convalescent sera tested to determine whether they have also 
seroconverted. However, the concordance of the hSARS virus with the clinical definition 
of SARS appears remarkable, given that clinical case definitions are never perfect. 

No evidence of HMPV infection, either by RT-PCR or rising antibody titer 
against HMPV, was detected in any of these patients. No other pathogen was 
consistently detected in our group of patients with SARS. It is therefore highly likely 
that this hSARS virus is either the cause of SARS or a necessary pre-requisite for disease 
progression. The issue of whether or not other microbial or other co-factors play a role in 
the progression of the disease remains to be investigated. 

The family Coronaviridae includes the genera Coronavirus and Torovirus. They 
are enveloped RNA viruses which cause disease in humans and animals. The previously 
known human coronaviruses, types 229E and OC43, are the major causes of the common 
cold (Holmes KV. Coronaviruses. Eds Knipe DM, Howley PM Fields Virology, 4th 
Edition, Lippincott Williams & Wilkins, Philadelphia, pp. 1187-1203). But, while they 
can occasionally cause pneumonia in older adults, neonates or immunocompromised 
patients (El-Sahly HM, Atmar RL, Glezen WP, Greenberg SB. Spectrum of clinical illness 
in hospitalized patients with "common cold" virus infections. Clin Infect Dis. 2000; 31: 
96-100; and Foltz EX, Elkordy MA. Coronavirus pneumonia following autologous bone 
marrow transplantation for breast cancer. Chest 1999; 115: 901-905), coronaviruses have 
been reported to be an important cause of pneumonia in military recruits, accounting for 
up to 30% of cases in some studies (Wenzel RP, Hendley JO, Davies JA, Gwaltney JM, 
Coronavirus infections in military recruits: Three-year study with coronavirus strains 
OC43 and 229E. Am Rev Respir Dis. 1974; 109: 621-624). Human coronaviruses can 
infect neurons and viral RNA has been detected in the brain of patients with multiple 
sclerosis (Talbot PJ, Cote G, Arbour N. Human coronavirus OC43 and 229E persistence 
in neural cell cultures and human brains. Adv Exp Med Biol. - in press). On the other 
hand, a number of animal coronaviruses (e.g. Porcine Transmissible Gastroenteritis Virus, 
Murine Hepatitis Virus, Avian Infectious Bronchitis Virus) cause respiratory, 
gastrointestinal, neurological or hepatic disease in their respective hosts (McIntosh K. 
Coronaviruses: a comparative review. Current Top Microbiol Immunol 1974; 63: 85- 
112). 

We describe for the first time the clinical presentation and complications of SARS. 
Fewer than 25% of patients with coronaviral pneumonia had upper respiratory tract 
symptoms. As expected in atypical pneumonia, both respiratory symptoms and positive 
auscultatory findings were very disproportional to the chest radiographic findings. 
Gastrointestinal symptoms were present in 10%. It is relevant that the viral RNA was 
detected in the stool samples of some patients and that coronaviruses have been associated 
with diarrhoea in animals and humans (Caul EO, Egglestone SI. Further studies on 
human enteric coronaviruses. Arch Virol 1977; 54: 107-17). The high incidence of 
deranged liver function, leucopenia, significant lymphopenia, thrombocytopenia and 
subsequent evolution into adult respiratory distress syndrome suggests a severe systemic 
inflammatory damage induced by this hSARS virus. Thus immuno-modulation by 
steroid may be important to complement the antiviral therapy by ribavirin. In this regard, 
it is pertinent that severe human disease associated with the avian influenza subtype 
H5N1, which is another virus that recently crossed from animals to humans, has also 
been postulated to have an immuno-pathological component (Cheung CY, Poon LLM, 
Lau ASY et al. Induction of proinflammatory cytokines in human macrophages by 
influenza A (H5N1) viruses: a mechanism for the unusual severity of human disease. 
Lancet 2002; 360: 1831-1837). In common with H5N1 disease, patients with severe 
SARS are adults, are significantly more lymphopenic and have parameters of organ 
dysfunction beyond the respiratory tract (Table 4) (Yuen KY, Chan PKS, Peiris JSM, et al. 
Clinical features and rapid viral diagnosis of human disease associated with avian influenza 
A H5N1 virus. Lancet 1998;351:467-471). It is important to note that a window of 
opportunity of around 8 days exists from the onset of symptoms to respiratory failure. 
Severe complicated cases are strongly associated with both underlying disease and 
delayed use of ribavirin and steroid therapy. Following our clinical experience in the 
initial cases, this combination therapy was started very early in subsequent cases which 
were largely uncomplicated cases at the time of admission. The overall mortality at the 
time of writing is only 2% with this treatment regimen. There were still 8 out of 19 
complicated cases who had not shown significant response. It is not possible to perform 
a detailed analysis of the therapeutic response to this combination regimen due to the 
heterogeneous dosing and time of initiation of therapy. 

Other factors associated with severe disease are acquisition of the disease through 
household contact, which may be attributed to a higher dose or duration of viral exposure, 
and the presence of underlying diseases. 

The clinical description reported here pertains largely to the more severe cases 
admitted to hospitals. We presently have no data on the full clinical spectrum of the 
emerging Coronaviridae infection in the community or in an out-patient setting. The 
availability of diagnostic tests as described here will help address these questions. In 
addition, it will allow questions pertaining to the period of virus shedding (and 
communicability) during convalescence, the presence of virus in other body fluids and 
excreta, and the presence of virus shedding during the incubation period to be addressed. 

The epidemiological data at present appear to indicate that the virus is spread by 
droplets or by direct and indirect contact, although airborne spread cannot be ruled out in 
some instances. The finding of infectious virus in the respiratory tract supports this 
contention. Preliminary evidence also suggests that the virus may be shed in the feces. 
However, it is important to note that detection of viral RNA does not prove that the virus 
is viable or transmissible. If viable virus is detectable in the feces, this would be a 
potential additional route of transmission that needs to be considered. It is relevant to 
note that a number of animal coronaviruses are spread via the fecal-oral route (McIntosh 
K. Coronaviruses: a comparative review. Current Top Microbiol Immunol. 1974; 63: 85- 
112). 

In conclusion, this report provides evidence that a virus in the Coronaviridae 
family is the etiological agent of SARS. The present invention discloses a quantitative 
diagnostic assay that provides rapid, sensitive and specific identification of the hSARS virus. 

7. DEPOSIT 

A sample of isolated hSARS virus was deposited with China Center for Type 
Culture Collection (CCTCC) at Wuhan University, Wuhan 430072 in China on April 2, 
2003 in accordance with the Budapest Treaty on the Deposit of Microorganisms, and 
accorded accession No. CCTCC-V200303, which is incorporated herein by reference in 
its entirety. 

8. MARKET POTENTIAL 

The hSARS virus can now be grown on a large scale, which allows the 
development of various diagnostic tests as described hereinabove as well as the 
development of vaccines and antiviral agents that are effective in preventing, 
ameliorating or treating SARS. Given the severity of the disease and its rapid global 
spread, it is highly likely that significant demands for diagnostic tests, therapies and 
vaccines to battle against the disease will arise on a global scale. In addition, this virus 
contains genetic information which is extremely important and valuable for clinical and 
scientific research applications. 

9. EQUIVALENTS 

Those skilled in the art will recognize, or be able to ascertain, many equivalents to 
the specific embodiments of the invention described herein using no more than routine 
experimentation. Such equivalents are intended to be encompassed by the following 
claims. 

All publications, patents and patent applications mentioned in this specification 
are incorporated herein by reference in their entireties into the specification to the same 
extent as if each individual publication, patent or patent application were specifically and 
individually indicated to be incorporated herein by reference in its entirety. 

Citation or discussion of a reference herein shall not be construed as an admission 
that such is prior art to the present invention. 

WHAT IS CLAIMED: 

1. An isolated nucleic acid molecule consisting essentially of the nucleic acid sequence 
of SEQ ID NO:2471, 2472, or a complement thereof. 

2. An isolated nucleic acid molecule consisting essentially of the nucleic acid sequence 
of SEQ ID NO:2474, 2475, or a complement thereof. 

3. An isolated nucleic acid molecule consisting essentially of the nucleic acid sequence 
of SEQ ID NO:2473, 2476, or a complement thereof. 

4. An isolated nucleic acid molecule which hybridizes under stringent conditions to a 
nucleic acid molecule having the nucleic acid sequence of SEQ ID NO:2471, 2472, 2473, 
2474, 2475 or 2476, or a complement thereof. 

5. An isolated polypeptide encoded by the nucleic acid molecule of any one of claims 
1-4. 

6. An antibody or an antigen-binding fragment thereof which immunospecifically 
binds to a peptide encoded by the nucleic acid sequence of SEQ ID NO:2471, 2472 or 2473. 

7. An antibody or an antigen-binding fragment thereof which immunospecifically 
binds to a peptide encoded by the nucleic acid sequence of SEQ ID NO:2474, 2475 or 2476. 

8. A method for detecting the presence of the hSARS virus in a biological sample, said 
method comprising: 

(a) amplifying a nucleic acid of the hSARS virus using primers having the 
nucleic acid sequence of SEQ ID NOS:2471 and/or 2472; and 

(b) detecting in the nucleic acid using a probe having the nucleic acid sequence 
of SEQ ID NO:2473. 

9. A method for detecting the presence of the hSARS virus in a biological sample, said 
method comprising: 

(a) amplifying a nucleic acid of the hSARS virus using primers having the 
nucleic acid sequence of SEQ ID NOS:2474 and/or 2475; and 

(b) detecting in the nucleic acid using a probe having the nucleic acid sequence 
of SEQ ID NO:2476. 

10. A method for identifying a subject infected with the hSARS virus, said method 
comprising: 

(a) obtaining total RNA from a biological sample obtained from the subject; 

(b) reverse transcribing the total RNA to obtain cDNA; and 

(c) subjecting the cDNA to PCR assay using a set of primers derived from a 
nucleotide sequence of the hSARS. 

11. A method for identifying a subject infected with the hSARS virus, said method 
comprising: 

(a) obtaining total RNA from a biological sample obtained from the subject; 

(b) reverse transcribing the total RNA to obtain cDNA; and 

(c) subjecting the cDNA to PCR assay using a set of primers having the nucleic 
acid sequence of SEQ ID NOS:2471 and/or 2472. 

12. The method of claim 11 further comprising (d) detecting a product of PCR assay 
with a probe. 

13. The method of claim 12, wherein the probe is a nucleic acid molecule having the 
nucleotide sequence of SEQ ID NO:2473. 

14. A method for identifying a subject infected with the hSARS virus, said method 
comprising: 

(a) obtaining total RNA from a biological sample obtained from the subject; 

(b) reverse transcribing the total RNA to obtain cDNA; and 

(c) subjecting the cDNA to PCR assay using a set of primers having the nucleic 
acid sequence of SEQ ID NOS:2474 and/or 2475. 

15. The method of claim 14 further comprising (d) detecting a product of PCR assay 
with a probe. 

16. The method of claim 15, wherein the probe is a nucleic acid molecule having the 
nucleotide sequence of SEQ ID NO:2476. 

17. A kit comprising in one or more containers one or more isolated nucleic acid 
molecules comprising a nucleotide sequence selected from the group consisting of SEQ ID 
NO:2471, SEQ ID NO:2472, and SEQ ID NO:2473. 

18. A kit comprising in one or more containers one or more isolated nucleic acid 
molecules comprising a nucleotide sequence selected from the group consisting of SEQ ID 
NO:2474, SEQ ID NO:2475, and SEQ ID NO:2476. 

FIG. 5A 

FIG. 7A 

t aaa tgt agt aga atc ata cct gcg cgt gcg cgc gta gag tgt ttt gat 49 
Lys Cys Ser Arg Ile Ile Pro Ala Arg Ala Arg Val Glu Cys Phe Asp 
1 5 10 15 

aaa ttc aaa gtg aat tca aca cta gaa cag tat gtt ttc tgc act gta 97 
Lys Phe Lys Val Asn Ser Thr Leu Glu Gln Tyr Val Phe Cys Thr Val 
20 25 30 

aat gca ttg cca gaa aca act gct gac att gta gtc ttt gat gaa atc 145 
Asn Ala Leu Pro Glu Thr Thr Ala Asp Ile Val Val Phe Asp Glu Ile 
35 40 45 

tct atg gct act aat tat gac ttg agt gtt gtc aat gct aga ctt cgt 193 
Ser Met Ala Thr Asn Tyr Asp Leu Ser Val Val Asn Ala Arg Leu Arg 
50 55 60 

gca aaa cac tac gtc tat att ggc gat cct gct caa tta cca gcc ccc 241 
Ala Lys His Tyr Val Tyr Ile Gly Asp Pro Ala Gln Leu Pro Ala Pro 
65 70 75 80 

cgc aca ttg ctg act aaa ggc aca cta gaa cca gaa tat ttt aat tca 289 
Arg Thr Leu Leu Thr Lys Gly Thr Leu Glu Pro Glu Tyr Phe Asn Ser 
85 90 95 

gtg tgc aga ctt atg aaa aca ata ggt cca gac atg ttc ctt gga act 337 
Val Cys Arg Leu Met Lys Thr Ile Gly Pro Asp Met Phe Leu Gly Thr 
100 105 110 

tgt cgc cgt tgt cct gct gaa att gtt gac act gtg agt gct tta gtt 385 
Cys Arg Arg Cys Pro Ala Glu Ile Val Asp Thr Val Ser Ala Leu Val 
115 120 125 

tat gac aat aag cta aaa gca cac aag gag aag tca gct caa tgc ttc 433 
Tyr Asp Asn Lys Leu Lys Ala His Lys Glu Lys Ser Ala Gln Cys Phe 
130 135 140 

aaa atg ttc tac aaa ggt gtt att aca cat gat gtt tca tct gca atc 481 
Lys Met Phe Tyr Lys Gly Val Ile Thr His Asp Val Ser Ser Ala Ile 
145 150 155 160 

aac aga cct caa ata ggc gtt gta aga gaa ttt ctt aca cgc aat cct 529 
Asn Arg Pro Gln Ile Gly Val Val Arg Glu Phe Leu Thr Arg Asn Pro 
165 170 175 

gct tgg aga aaa gct gtt ttt atc tca cct tat aat tca cag aac gct 577 
Ala Trp Arg Lys Ala Val Phe Ile Ser Pro Tyr Asn Ser Gln Asn Ala 
180 185 190 

gta gct tca aaa atc tta gga ttg cct acg cag act gtt gat tca tca 625 
Val Ala Ser Lys Ile Leu Gly Leu Pro Thr Gln Thr Val Asp Ser Ser 
195 200 205 

cag ggt tct gaa tat gac tat gtc ata ttc aca caa act act gaa aca 673 
Gln Gly Ser Glu Tyr Asp Tyr Val Ile Phe Thr Gln Thr Thr Glu Thr 
210 215 220 

gca cac tct tgt aat gtc aac cgc ttc aat gtg gct atc aca agg gca 721 
Ala His Ser Cys Asn Val Asn Arg Phe Asn Val Ala Ile Thr Arg Ala 
225 230 235 240 

aaa att ggc att ttg tgc ata atg tct gat aga gat ctt tat gac aaa 769 
Lys Ile Gly Ile Leu Cys Ile Met Ser Asp Arg Asp Leu Tyr Asp Lys 
245 250 255 

ctg caa ttt aca agt cta gaa ata cca cgt cgc aat gtg gct aca tta 817 
Leu Gln Phe Thr Ser Leu Glu Ile Pro Arg Arg Asn Val Ala Thr Leu 
260 265 270 

caa gca gaa aat gta act gga ctt ttt aag gac tgt agt aag atc att 865 
Gln Ala Glu Asn Val Thr Gly Leu Phe Lys Asp Cys Ser Lys Ile Ile 
275 280 285 

act ggt ctt cat cct aca cag gca cct aca cac ctc agc gtt gat ata 913 
Thr Gly Leu His Pro Thr Gln Ala Pro Thr His Leu Ser Val Asp Ile 
290 295 300 

aaa ttc aag act gaa gga tta tgt gtt gac ata cca ggc ata cca aag 961 
Lys Phe Lys Thr Glu Gly Leu Cys Val Asp Ile Pro Gly Ile Pro Lys 
305 310 315 320 

gac atg acc tac cgt aga ctc atc tct atg atg ggt ttc aaa atg aat 1009 
Asp Met Thr Tyr Arg Arg Leu Ile Ser Met Met Gly Phe Lys Met Asn 
325 330 335 

tac caa gtc aat ggt tac cct aat atg ttt atc acc cgc gaa gaa gct 1057 
Tyr Gln Val Asn Gly Tyr Pro Asn Met Phe Ile Thr Arg Glu Glu Ala 
340 345 350 

att cgt cac gtt cgt gcg tgg att ggc ttt gat gta gag ggc tgt cat 1105 
Ile Arg His Val Arg Ala Trp Ile Gly Phe Asp Val Glu Gly Cys His 
355 360 365 

gca act aga gat gct gtg ggt act aac cta cct ctc cag cta gga ttt 1153 
Ala Thr Arg Asp Ala Val Gly Thr Asn Leu Pro Leu Gln Leu Gly Phe 
370 375 380 

tct aca ggt gtt aac tta gta gct gta ccg act ggt tat gtt gac act 1201 
Ser Thr Gly Val Asn Leu Val Ala Val Pro Thr Gly Tyr Val Asp Thr 
385 390 395 400 

gaa aat aac cta 1213 

Glu Asn Asn Leu 

c aga acc atg cct aac atg ctt agg ata atg gcc tct ctt gtt ctt gct 49 
Arg Thr Met Pro Asn Met Leu Arg Ile Met Ala Ser Leu Val Leu Ala 
1 5 10 15 

cgc aaa cat aac act tgc tgt aac tta tca cac cgt ttc tac agg tta 97 
Arg Lys His Asn Thr Cys Cys Asn Leu Ser His Arg Phe Tyr Arg Leu 
20 25 30 

gct aac gag tgt gcg caa gta tta agt gag atg gtc atg tgt ggc ggc 145 
Ala Asn Glu Cys Ala Gln Val Leu Ser Glu Met Val Met Cys Gly Gly 
35 40 45 

tca cta tat gtt aaa cca ggt gga aca tca tcc ggt gat gct aca act 193 
Ser Leu Tyr Val Lys Pro Gly Gly Thr Ser Ser Gly Asp Ala Thr Thr 
50 55 60 

gct tat gct aat agt gtc ttt aac att tgt caa gct gtt aca gcc aat 241 
Ala Tyr Ala Asn Ser Val Phe Asn Ile Cys Gln Ala Val Thr Ala Asn 
65 70 75 80 

gta aat gca ctt ctt tca act gat ggt aat aag ata gct gac aag tat 289 
Val Asn Ala Leu Leu Ser Thr Asp Gly Asn Lys Ile Ala Asp Lys Tyr 
85 90 95 

gtc cgc aat cta caa cac agg ctc tat gag tgt ctc tat aga aat agg 337 
Val Arg Asn Leu Gln His Arg Leu Tyr Glu Cys Leu Tyr Arg Asn Arg 
100 105 110 

gat gtt gat cat gaa ttc gtg gat gag ttt tac gct tac ctg cgt aaa 385 
Asp Val Asp His Glu Phe Val Asp Glu Phe Tyr Ala Tyr Leu Arg Lys 
115 120 125 

cat ttc tcc atg atg att ctt tct gat gat gcc gtt gtg tgc tat aac 433 
His Phe Ser Met Met Ile Leu Ser Asp Asp Ala Val Val Cys Tyr Asn 
130 135 140 

agt aac tat gcg gct caa ggt tta gta gct agc att aag aac ttt aag 481 
Ser Asn Tyr Ala Ala Gln Gly Leu Val Ala Ser Ile Lys Asn Phe Lys 
145 150 155 160 

gca gtt ctt tat tat caa aat aat gtg ttc atg tct gag gca aaa tgt 529 
Ala Val Leu Tyr Tyr Gln Asn Asn Val Phe Met Ser Glu Ala Lys Cys 
165 170 175 

tgg act gag act gac ctt act aaa gga cct cac gaa ttt tgc tca cag 577 
Trp Thr Glu Thr Asp Leu Thr Lys Gly Pro His Glu Phe Cys Ser Gln 
180 185 190 

cat aca atg cta gtt aaa caa gga gat gat tac gtg tac ctg cct tac 625 
His Thr Met Leu Val Lys Gln Gly Asp Asp Tyr Val Tyr Leu Pro Tyr 
195 200 205 

cca gat cca tca aga ata tta ggc gca ggc tgt ttt gtc gat gat att 673 
Pro Asp Pro Ser Arg Ile Leu Gly Ala Gly Cys Phe Val Asp Asp Ile 
210 215 220 

gtc aaa cag atg gta cac tta tga ttg aaa ggt tcc gtg tca ctg gct 721 
Val Lys Gln Met Val His Leu 

1 atattaggtt tttacctacc caggaaaagc caaccaacct cgatctcttg tagatctgtt 

61 ctctaaacga actttaaaat ctgtgtagct gtcgcccggc tgcatgccta gtgcacctac 

121 gcagtataaa caataataaa ttttactgtc gttgacaaga aacgagtaac tcgtccctct 

181 tctgcagact gcttacggtt tcgtccgtgt tgcagtcgat catcagcata cctaggtttc 

241 gtccgggtgt gaccgaaagg taagatggag agccttgttc ttggtgtcaa cgagaaaaca 

301 cacgtccaac tcagtttgcc tgtccttcag gttagagacg tgctagtgcg tggcttcggg 

361 gactctgtgg aagaggccct atcggaggca cgtgaacacc tcaaaaatgg cacttgtggt 

421 ctagtagagc tggaaaaagg cgtactgccc cagcttgaac agccctatgt gttcattaaa 

481 cgttctgatg ccttaagcac caatcacggc cacaaggtcg ttgagctggt tgcagaaatg 

541 gacggcattc agtacggtcg tagcggtata acactgggag tactcgtgcc acatgtgggc 

601 gaaaccccaa ttgcataccg caatgttctt cttcgtaaga acggtaataa gggagccggt 

661 ggtcatagct atggcatcga tctaaagtct tatgacttag gtgacgagct tggcactgat 

721 cccattgaag attatgaaca aaactggaac actaagcatg gcagtggtgc actccgtgaa 

781 ctcactcgtg agctcaatgg aggtgcagtc actcgctatg tcgacaacaa tttctgtggc 

841 ccagatgggt accctcttga ttgcatcaaa gattttctcg cacgcgcggg caagtcaatg 

901 tgcactcttt ccgaacaact tgattacatc gagtcgaaga gaggtgtcta ctgctgccgt 

961 gaccatgagc atgaaattgc ctggttcact gagcgctctg ataagagcta cgagcaccag 

1021 acacccttcg aaattaagag tgccaagaaa tttgacactt tcaaagggga atgcccaaag 

1081 tttgtgtttc ctcttaactc aaaagtcaaa gtcatlzcaac cacgtgttga aaagaaaaag 

1141 actgagggtt tcatggggcg tatacgctct gtgtaccctg ttgcatctcc acaggagtgt 

1201 aacaatatgc acttgtctac cttgatgaaa tgtaatcatt gcgatgaagt ttcatggcag 

1261 acgtgcgact ttctgaaagc cacttgtgaa cattgtggca ctgaaaattt agttattgaa 

1321 ggacctacta catgtgggta cctacctact aatgctgtag tgaaaatgcc atgtcctgcc 

1381 tgtcaagacc cagagattgg acctgagcat agtgttgcag attatcacaa ccactcaaac 

1441 attgaaactc gactccgcaa gggaggtagg actagatgtt ttggaggctg tgtgtttgcc 

1501 tatgttggct gctataataa gcgtgcctac tgggttcctc gtgctagtgc tgatattggc 

1561 tcaggccata ctggcattac tggtgacaat gtggagacct tgaatgagga tctccttgag 

1621 atactgagtc gtgaacgtgt taacattaac attgttggcg attttcattt gaatgaagag 

1681 gttgccatca ttttggcatc tttctctgct tctacaagtg cctttattga cactataaag 

1741 agtcttgatt acaagtcttt caaaaccatt gttgagtcct gcggtaacta taaagttacc 

1801 aagggaaagc ccgtaaaagg tgcttggaac attggacaac agagatcagt tttaacacca 

1861 ctgtgtggtt ttccctcaca ggctgctggt gttatcagat caatttttgc gcgcacactt 

1921 gatgcagcaa accactcaat tcctgatttg caaagagcag ctgtcaccat acttgatggt 

1981 atttctgaac agtcattacg tcttgtcgac gccatggttt atacttcaga cctgctcacc 

2041 aacagtgtca ttattatggc atatgtaact ggtggtcttg tacaacagac ttctcagtgg 

2101 ttgtctaatc ttttgggcac tactgttgaa aaactcaggc ctatctttga atggattgag 

2161 gcgaaactta gtgcaggagt tgaatttctc aaggatgctt gggagattct caaatttctc 

2221 attacaggtg tttttgacat cgtcaagggt caaatacagg ttgcttcaga taacatcaag 

2281 gattgtgtaa aatgcttcat tgatgttgtt aacaaggcac tcgaaatgtg cattgatcaa 

2341 gtcactatcg ctggcgcaaa gttgcgatca ctcaacttag gtgaagtctt catcgctcaa 

2401 agcaagggac tttaccgtca gtgtatacgt ggcaaggagc agctgcaact actcatgcct 

2461 cttaaggcac caaaagaagt aacctttctt gaaggtgatt cacatgacac agtacttacc 

2521 tctgaggagg ttgttctcaa gaacggtgaa ctcgaagcac tcgagacgcc cgttgatagc 

2581 ttcacaaatg gagctatcgt cggcacacca gtctgtgtaa atggcctcat gctcttagag 

2641 attaaggaca aagaacaata ctgcgcattg tctcctggtt tactggctac aaacaatgtc 

2701 tttcgcttaa aagggggtgc accaattaaa ggtgtaacct ttggagaaga tactgtttgg 

2761 gaagttcaag gttacaagaa tgtgagaatc acatttgagc ttgatgaacg tgttgacaaa 

2821 gtgcttaatg aaaagtgctc tgtctacact gttgaatccg gtaccgaagt tactgagttt 

2881 gcatgtgttg tagcagaggc tgttgtgaag actttacaac cagtttctga tctccttacc 

2941 aacatgggta ttgatcttga tgagtggagt gtagctacat tctacttatt tgatgatgct 

3001 ggtgaagaaa acttttcatc acgtatgtat tgttcctttt accctccaga tgaggaagaa 

3061 gaggacgatg cagagtgtga ggaagaagaa attgatgaaa cctgtgaaca tgagtacggt 

3121 acagaggatg attatcaagg tctccctctg gaatttggtg cctcagctga aacagttcga 

3181 gttgaggaag aagaagagga agactggctg gatgatacta ctgagcaatc agagattgag 

3241 ccagaaccag aacctacacc tgaagaacca gttaatcagt ttactggtta tttaaaactt 

3301 actgacaatg ttgccattaa atgtgttgac atcgttaagg aggcacaaag tgctaatcct 
3361 atggtgattg taaatgctgc taacatacac ctgaaacatg gtggtggtgt agcaggtgca 
3421 ctcaacaagg caaccaatgg tgccatgcaa aaggagagtg atgattacat taagctaaat 
3481 ggccctctta cagtaggagg gtcttgtttg ctttctggac ataatcttgc taagaagtgt 
3541 ctgcatgttg ttggacctaa cctaaatgca ggtgaggaca tccagcttct taaggcagca 
3601 tatgaaaatt tcaattcaca ggacatctta cttgcaccat tgttgtcagc aggcatattt 
3661 ggtgctaaac cacttcagtc tttacaagtg tgcgtgcaga cggttcgtac acaggtttat 
3721 attgcagtca atgacaaagc tctttatgag caggttgtca tggattatct tgataacctg 
3781 aagcctagag tggaagcacc taaacaagag gagccaccaa acacagaaga ttccaaaact 
3841 gaggagaaat ctgtcgtaca gaagcctgtc gatgtgaagc caaaaattaa ggcctgcatt 

3901 gatgaggtta ccacaacact ggaagaaact aagtttctta ccaataagtt actcttgttt 
3961 gctgatatca atggtaagct ttaccatgat tctcagaaca tgcttagagg tgaagatatg 
4021 tctttccttg agaaggatgc accttacatg guagg-gatg ttatcactag tggtgatatc 

4081 acttgtgttg taataccctc caaaaaggct ggtggcacta ctgagatgct ctcaagagct 
4141 ttgaagaaag tgccagttga tgagtatata accacgtacc ctggacaagg atgtgctggt 
4201 tatacacttg aggaagctaa gactgctctt aagaaatgca aatctgcatt ttatgtacta 
4261 ccttcagaag cacctaatgc taaggaagag attctaggaa ctgtatcctg gaatttgaga 
4321 gaaatgcttg ctcatgctga agagacaaga aaattaatgc ctatatgcat ggatgttaga 
4381 gccataatgg caaccatcca acgtaagtat aaaggaatta aaattcaaga gggcatcgtt 
4441 gactatggtg tccgattctt cttttatact agtaaagagc ctgtagcttc tattattacg 
4501 aagctgaact ctctaaatga gccgcttgtc acaatgccaa ttggttatgt gacacatggt 
4561 tttaatcttg aagaggctgc gcgctgtatg cgttctctta aagctcctgc cgtagtgtca 
4621 gtatcatcac cagatgctgt tactacatat aatggatacc tcacttcgtc atcaaagaca 
4681 tctgaggagc actttgtaga aacagtttct ttggctggct cttacagaga ttggtcctat 
4741 tcaggacagc gtacagagtt aggtgttgaa tttcttaagc gtggtgacaa aattgtgtac 
4801 cacactctgg agagccccgt cgagtttcat cttgacggtg aggttctttc acttgacaaa 
4861 ctaaagagtc tcttatccct gcgggaggtt aagactataa aagtgttcac aactgtggac 
4921 aacactaatc tccacacaca gcttgtggat atgtctatga catatggaca gcagtttggt 
4981 ccaacatact tggatggtgc tgatgttaca aaaattaaac ctcatgtaaa tcatgagggt 
5041 aagactttct ttgtactacc tagtgatgac acactacgta gtgaagcttt cgagtactac 
5101 catactcttg atgagagttt tcttggtagg tacatgtctg ctttaaacca cacaaagaaa 
5161 tggaaatttc ctcaagttgg tggtttaact tcaattaaat gggctgataa caattgttat 
5221 ttgtctagtg ttttattagc acttcaacag cttgaagtca aattcaatgc accagcactt 
5281 caagaggctt attatagagc ccgtgctggt gatgctgcta acttttgtgc actcatactc 
5341 gcttacagta ataaaactgt tggcgagctt ggtgatgtca gagaaactat gacccatctt 
5401 ctacagcatg ctaatttgga atctgcaaag cgagttctta atgtggtgtg taaacattgt 
5461 ggtcagaaaa ctactacctt aacgggtgta gaagctgtga tgtatatggg tactctatct 
5521 tatgataatc ttaagacagg tgtttccatt ccatgtgtgt gtggtcgtga tgctacacaa 
5581 tatctagtac aacaagagtc ttcttttgtt atgatgtctg caccacctgc tgagtataaa 
5641 ttacagcaag gtacattctt atgtgcgaat gagtacactg gtaactatca gtgtggtcat 
5701 tacactcata taactgctaa ggagaccctc tatcgtattg acggagctca ccttacaaag 
5761 atgtcagagt acaaaggacc agtgactgat gttttctaca aggaaacatc ttacactaca 
5821 accatcaagc ctgtgtcgta taaactcgat ggagttactt acacagagat tgaaccaaaa 
5881 ttggatgggt attataaaaa ggataatgct tactatacag agcagcctat agaccttgta 
5941 ccaactcaac cattaccaaa tgcgagtttt gataatttca aactcacatg ttctaacaca 
6001 aaatttgctg atgatttaaa tcaaatgaca ggcttcacaa agccagcttc acgagagcta 
6061 tctgtcacat tcttcccaga cttgaatggc gatgtagtgg ctattgacta tagacactat 
6121 tcagcgagtt tcaagaaagg tgctaaatta ctgcataagc caattgtttg gcacattaac 
6181 caggctacaa ccaagacaac gttcaaacca aacacttggt gtttacgttg tctttggagt 
6241 acaaagccag tagatacttc aaattcattt gaagttctgg cagtagaaga cacacaagga 
6301 atggacaatc ttgcttgtga aagtcaacaa cccacctctg aagaagtagt ggaaaatcct 

6361 accatacaga aggaagtcat agagtgtgac gtgaaaacta ccgaagttgt aggcaatgtc 
6421 atacttaaac catcagatga aggtgttaaa gtaacacaag agttaggtca tgaggatctt 

6481 atggctgctt atgtggaaaa cacaagcatt accattaaga aacctaatga gctttcacta 
6541 gccttaggtt taaaaacaat tgccactcat ggtattgctg caattaatag tgttccttgg 
6601 agtaaaattt tggcttatgt caaaccattc ttaggacaag cagcaattac aacatcaaat 
6661 tgcgctaaga gattagcaca acgtgtgttt aacaattata tgccttatgt gtttacatta 
6721 ttgttccaat tgtgtacttt tactaaaagt accaattcta gaattagagc ttcactacct 
6781 acaactattg ctaaaaatag tgttaagagt gttgctaaat tatgtttgga tgccggcatt 
6841 aattatgtga agtcacccaa attttctaaa ttgttcacaa tcgctatgtg gctattgttg 
6901 ttaagtattt gcttaggttc tctaatctgt gtaactgctg cttttggtgt actcttatct 
6961 aattttggtg ctccttctta ttgtaatggc gttagagaat tgtatc^taa ttcgtctaac 
7021 gttactacta tggatttctg tgaaggttct tttccttgca gcatttgttt aagtggatta 
7081 gactcccttg attcttatcc agctcttgaa accattcagg tgacgatttc atcgtacaag 
7141 ctagacttga caattttagg tctggccgct gagtgggttt tggcatatat gttgttcaca 
7201 aaattctttt atttattagg tctttcagct ataatgcagg tgttctttgg ctattttgct 
7261 agtcatttca tcagcaattc ttggctcatg tggtttatca ttagtattgt acaaatggca 
7321 cccgtttctg caatggttag gatgtacatc ttctttgctt ctttctacta catatggaag 
7381 agctatgttc atatcatgga tggttgcacc tcttcgactt gcatgatgtg ctataagcgc 
7441 aatcgtgcca cacgcgttga gtgtacaact attgttaatg gcatgaagag atctttctat 
7501 gtctatgcaa atggaggccg tggcttctgc aagac.caca attggaattg tctcaattgt 
7561 gacacatttt gcactggtag tacattcatt agtgatgaag ttgctcgtga tttgtcactc 
7621 cagtttaaaa gaccaatcaa ccctactgac cagtcatcgt atattgttga tagtgttgct 
7681 gtgaaaaatg gcgcgcttca cctctacttt gacaaggctg gtcaaaagac ctatgagaga 
7741 catccgctct cccattttgt caatttagac aatttgagag ctaacaacac taaaggttca 
7801 ctgcctatta atgtcatagt ttttgatggc aagtccaaat gcgacgagtc tgcttctaag 
7861 tctgcttctg tgtactacag tcagctgatg tgccaaccta ttctgttgct tgaccaagct 
7921 cttgtatcaa acgttggaga tagtactgaa gtttccgtta agatgtttga tgcttatgtc 
7981 gacacctttt cagcaacttt tagtgttcct atggaaaaac ttaaggcact tgttgctaca 
8041 gctcacagcg agttagcaaa gggtgtagct ttagatggtg tcctttctac attcgtgtca 
8101 gctgcccgac aaggtgttgt tgataccgat gttgacacaa aggatgttat tgaatgtctc
8161 aaactttcac atcactctga cttagaagtg acaggtgaca gttgtaacaa tttcatgctc 
8221 acctataata aggttgaaaa catgacgccc agagatcttg gcgcatgtat tgactgtaat 
8281 gcaaggcata tcaatgccca agtagcaaaa agtcacaatg tttcactcat ctggaatgta 
8341 aaagactaca tgtctttatc tgaacagctg cgtaaacaaa ttcgtactgc tgccaagaag 
8401 aacaacatac cttttacact aacttgtgct acaactagac aggttgtcaa tgtcataact 
8461 actaaaatct cactcaaggg tggtaagatt gttagtactt gttttaaact tatgcttaag 
8521 gccacattat tgtgcgttct tgctgcattg gtttgttata tcgttatgcc agtacataca 
8581 ttgtcaatcc atgatggtta cacaaatgaa atcattggtt acaaagccat tcaggatggt 
8641 gtcactcgtg acatcatttc tactgatgat tgttttgcaa ataaacatgc tggttttgac 
8701 gcatggttta gccagcgtgg tggttcatac aaaaatgaca aaagctgccc tgtagtagct 
8761 gctatcatta caagagagat tggtttcata gtgcctggct taccgggtac tgtgctgaga
8821 gcaatcaatg gtgacttctt gcattttcta cctcgtgttt ttagtgctgt tggcaacatt 
8881 tgctacacac cttccaaact cattgagtat agtgattttg ctacctctgc ttgcgttctt 
8941 gctgctgagt gtacaatttt taaggatgct atgggcaaac ctgtgccata ttgttatgac 
9001 actaatttgc tagagggttc tatttcttat agtgagcttc gtccagacac tcgttatgtg 
9061 cttatggatg gttccatcat acagtttcct aacacttacc tggagggttc tgttagagta
9121 gtaacaactt ttgatgctga gtactgtaga catggtacat gcgaaaggtc agaagtaggt 
9181 atttgcctat ctaccagtgg tagatgggtt cttaataatg agcattacag agctctatca 
9241 ggagttttct gtggtgttga tgcgatgaat ctcatagcta acatctttac tcctcttgtg 
9301 caacctgtgg gtgctttaga tgtgtctgct tcagtagtgg ctggtggtat tattgccata 
9361 ttggtgactt gtgctgccta ctactttatg aaattcagac gtgtttttgg tgagtacaac 
9421 catgttgttg ctgctaatgc acttttgttt ttgatgtctt tcactatact ctgtctggta 
9481 ccagcttaca gctttctgcc gggagtctac tcagtctttt acttgtactt gacattctat 
9541 ttcaccaatg atgtttcatt cttggctcac cttcaatggt ttgccatgtt ttctcctatt 
9601 gtgccttttt ggataacagc aatctatgta ttctgtattt ctctgaagca ctgccattgg 
9661 ttctttaaca actatcttag gaaaagagtc atgtttaatg gagttacatt tagtaccttc 
9721 gaggaggctg ctttgtgtac ctttttgctc aacaaggaaa tgtacctaaa attgcgtagc 
9781 gagacactgt tgccacttac acagtataac aggtatcttg ctctatataa caagtacaag 
9841 tatttcagtg gagccttaga tactaccagc tatcgtgaag cagcttgctg ccacttagca 
9901 aaggctctaa atgactttag caactcaggt gctgatgttc tctaccaacc accacagaca 
9961 tcaatcactt ctgctgttct gcagagtggt tttaggaaaa tggcattccc gtcaggcaaa 
10021 gttgaagggt gcatggtaca agtaacctgt ggaactacaa ctcttaatgg attgtggttg 
10081 gatgacacag tatactgtcc aagacatgtc atttgcacag cagaagacat gcttaatcct

10141 aactatgaag atctgctcat tcgcaaatcc aaccatagct ttcttgntca ggctggcaat 

10201 gttcaacttc gtgttattgg ccattctatg caaaattgtc tgcttaggct taaagttgat 

10261 acttctaacc ctaagacacc caagtataaa tttgtccgta tccaacctgg tcaaacattt 

10321 tcagttctag catgctacaa tggttcacca tctggtgttt atcagtgtgc catgagacct 

10381 aatcatacca ttaaaggttc tttccttaat ggatcatgtg gtagtgttgg ttttaacatt 

10441 gattatgatt gcgtgtcttt ctgctatatg catcatatgg agcttccaac aggagtacac 

10501 gctggtactg acttagaagg taaattctat ggtccatttg ttgacagaca aactgcacag 

10561 gctgcaggta cagacacaac cataacatta aatgttttgg catggctgta tgctgctgtt 

10621 atcaatggtg ataggtggtt tcttaataga ttcaccacta ctttgaatga ctttaacctt 

10681 gtggcaatga agtacaacta tgaacctttg acacaagatc atgttgacat attgggacct

10741 ctttctgctc aaacaggaat tgccgtctta gatatgtgtg ctgctttgaa agagctgctg

10801 cagaatggta tgaatggtcg tactatcctt ggtagcacta ttttagaaga tgagtttaca 

10861 ccatttgatg ttgttagaca atgctctggt gttaccttcc aaggtaagtt caagaaaatt 

10921 gttaagggca ctcatcattg gatgctttta actttcttga catcactatt gattcttgtt 

10981 caaagtacac agtggtcact gtttttcttt gtttacgaga atgctttctt gccatttact 

11041 cttggtatta tggcaattgc tgcatgtgct atgctgcttg ttaagcataa gcacgcattc 

11101 ttgtgcttgt ttctgttacc ttctcttgca acagttgctt actttaatat ggtctacatg 

11161 cctgctagct gggtgatgcg tatcatgaca tggcttgaat tggctgacac tagcttgtct 

11221 ggttataggc ttaaggattg tgttatgtat gcttcagctt tagttttgct tattctcatg 

11281 acagctcgca ctgtttatga tgatgctgct agacgtgttt ggacactgat gaatgtcatt 

11341 acacttgttt acaaagtcta ctatggtaat gctttagatc aagctatttc catgtgggcc 

11401 ttagttattt ctgtaacctc taactattct ggtgtcgtta cgactatcat gtttttagct

11461 agagctatag tgtttgtgtg tgttgagtat tacccattgt tatttattac tggcaacacc 

11521 ttacagtgta tcatgcttgt ttattgtttc ttaggctatt gttgctgctg ctactttggc

11581 cttttctgtt tactcaaccg ttacttcagg cttactcttg gtgtttatga ctacttggtc 

11641 tctacacaag aatttaggta tatgaactcc caggggcttt tgcctcctaa gagtagtatt 

11701 gatgctttca agcttaacat taagttgttg ggtattggag gtaaaccatg tatcaaggtt 

11761 gctactgtac agtctaaaat gtctgacgta aagtgcacat ctgtggtact gctctcggtt

11821 cttcaacaac ttagagtaga gtcatcttct aaattgtggg cacaatgtgt acaactccac 

11881 aatgatattc ttcttgcaaa agacacaact gaagctttcg agaagatggt ttctcttttg 

11941 tctgttttgc tatccatgca gggtgctgta gacattaata ggttgtgcga ggaaatgctc 

12001 gataaccgtg ctactcttca ggctattgct tcagaattta gttctttacc atcatatgcc 

12061 gcttatgcca ctgcccagga ggcctatgag caggctgtag ctaatggtga ttctgaagtc 

12121 gttctcaaaa agttaaagaa atctttgaat gtggctaaat ctgagtttga ccgtgatgct 

12181 gccatgcaac gcaagttgga aaagatggca gatcaggcta tgacccaaat gtacaaacag 

12241 gcaagatctg aggacaagag ggcaaaagta actagtgcta tgcaaacaat gctcttcact

12301 atgcttagga agcttgataa tgatgcactt aacaacatta tcaacaatgc gcgtgatggt 

12361 tgtgttccac tcaacatcat accattgact acagcagcca aactcatggt tgttgtccct 

12421 gattatggta cctacaagaa cacttgtgat ggtaacacct ttacatatgc atctgcactc 

12481 tgggaaatcc agcaagttgt tgatgcggat agcaagattg ttcaacttag tgaaattaac 

12541 atggacaatt caccaaattt ggcttggcct cttattgtta cagctctaag agccaactca 

12601 gctgttaaac tacagaataa tgaactgagt ccagtagcac tacgacagat gtcctgtgcg 

12661 gctggtacca cacaaacagc ttgtactgat gacaatgcac ttgcctacta taacaattcg 

12721 aagggaggta ggtttgtgct ggcattacta tcagaccacc aagatctcaa atgggctaga 

12781 ttccctaaga gtgatggtac aggtacaatt tacacagaac tggaaccacc ttgtaggttt 

12841 gttacagaca caccaaaagg gcctaaagtg aaatacttgt acttcatcaa aggcttaaac 

12901 aacctaaata gaggtatggt gctgggcagt ttagctgcta cagtacgtct tcaggctgga 

12961 aatgctacag aagtacctgc caattcaact gtgctttcct tctgtgcttt tgcagtagac 

13021 cctgctaaag catataagga ttacctagca agtggaggac aaccaatcac caactgtgtg 

13081 aagatgttgt gtacacacac tggtacagga caggcaatta ctgtaacacc agaagctaac 

13141 atggaccaag agtcctttgg tggtgcttca tgttgtctgt attgtagatg ccacattgac 

13201 catccaaatc ctaaaggatt ctgtgacttg aaaggtaagt acgtccaaat acctaccact 

13261 tgtgctaatg acccagtggg ttttacactt agaaacacag tctgtaccgt ctgcggaatg 

13321 tggaaaggtt atggctgtag ttgtgaccaa ctccgcgaac ccttgatgca gtctgcggat 

13381 gcatcaacgt ttttaaacgg gtttgcggtg taagtgcagc ccgtcttaca ccgtgcggca 
13441 caggcactag tactgatgtc gtctacaggg cttttgatat ttacaacgaa aaaagtgctg
13501 gttttgcaaa gttcctaaaa actaattgct gtcgcttcca ggagaaggat gaggaaggca 
13561 atttattaga ctcttacttt gtagttaaga ggcatactat gtctaactac caacatgaag 
13621 agactattta taacttggtt aaagattgtc cagcggttgc tgtccatgac tttttcaagt 
13681 ttagagtaga tggtgacatg gtaccacata tatcacgtca gcgtctaact aaatacacaa 
13741 tggctgattt agtctatgct ctacgtcatt ttgatgaggg taattgtgat acattaaaag 
13801 aaatactcgt cacatacaat tgctgtgatg atgat^attt caataagaag gattggtatg 
13861 acttcgtaga gaatcctgac atcttacgcg tatatgctaa cttaggtgag cgtgtacgcc
13921 aatcattatt aaagactgta caattctgcg atgctatgcg tgatgcaggc attgtaggcg 
13981 tactgacatt agataatcag gatcttaatg ggaactggta cgatttcggt gatttcgtac 
14041 aagtagcacc aggctgcgga gttcctattg tggattcata ttactcattg ctgatgccca 
14101 tcctcacttt gactagggca ttggctgctg agtcccatat ggatgctgat ctcgcaaaac 
14161 cacttattaa gtgggatttg ctgaaatatg attttacgga agagagactt tgtctcttcg 
14221 accgttattt taaatattgg gaccagacat accatcccaa ttgtattaac tgtttggatg 
14281 ataggtgtat ccttcattgt gcaaacttta atgtgttatt ttctactgtg tttccaccta 
14341 caagttttgg accactagta agaaaaatat ttgtagatgg tgttcctttt gttgtttcaa 
14401 ctggatacca ttttcgtgag ttaggagtcg tacataatca ggatgtaaac ttacatagct 
14461 cgcgtctcag tttcaaggaa cttttagtgt atgctgctga tccagctatg catgcagctt
14521 ctggcaattt attgctagat aaacgcacta catgcttttc agtagctgca ctaacaaaca
14581 atgttgcttt tcaaactgtc aaacccggta attttaataa agacttttat gactttgctg 
14641 tgtctaaagg tttctttaag gaaggaagtt ctgttgaact aaaacacttc ttctttgctc 
14701 aggatggcaa cgctgctatc agtgattatg actattatcg ttataatctg ccaacaatgt 
14761 gtgatatcag acaactccta ttcgtagttg aagttgttga taaatacttt gattgttacg
14821 atggtggctg tattaatgcc aaccaagtaa tcgttaacaa tctggataaa tcagctggtt 
14881 tcccatttaa taaatggggt aaggctagac tttattatga ctcaatgagt tatgaggatc 
14941 aagatgcact tttcgcgtat actaagcgta atgtcatccc tactataact caaatgaatc 
15001 ttaagtatgc cattagtgca aagaatagag ctcgcaccgt agctggtgtc tctatctgta 
15061 gtactatgac aaatagacag tttcatcaga aattattgaa gtcaatagcc gccactagag 
15121 gagctactgt ggtaattgga acaagcaagt tttacggtgg ctggcataat atgttaaaaa 
15181 ctgtttacag tgatgtagaa actccacacc ttatgggttg ggattatcca aaatgtgaca 
15241 gagccatgcc taacatgctt aggataatgg cctctcttgt tcttgctcgc aaacataaca 
15301 cttgctgtaa cttatcacac cgtttctaca ggttagctaa cgagtgtgcg caagtattaa 
15361 gtgagatggt catgtgtggc ggctcactat atgttaaacc aggtggaaca tcatccggtg 
15421 atgctacaac tgcttatgct aatagtgtct ttaacatttg tcaagctgtt acagccaatg 
15481 haaatgcact tctttcaact gatggtaata agatagctga caagtatgtc cgcaatctac 
15541 aacacaggct ctatgagtgt ctctatagaa atagggatgt tgatcatgaa ttcgtggatg 
15601 agttttacgc ttacctgcgt aaacatttct ccatgatgat tctttctgat gatgccgttg 
15661 tgtgctataa cagtaactat gcggctcaag gtttagtagc tagcattaag aactttaagg 
15721 cagttcttta ttatcaaaat aatgtgttca tgtctgaggc aaaatgttgg actgagactg 
15781 accttactaa aggacctcac gaattttgct cacagcatac aatgctagtt aaacaaggag 
15841 atgattacgt gtacctgcct tacccagatc catcaagaat attaggcgca ggctgttttg 
15901 tcgatgatat tgtcaaaaca gatggtacac ttatgattga aaggttcgtg tcactggcta 
15961 ttgatgctta cccacttaca aaacatccta atcaggagta tgctgatgtc tttcacttgt 
16021 atttacaata cattagaaag ttacatgatg agcttactgg ccacatgttg gacatgtatt 
16081 ccgtaatgct aactaatgat aacacctcac ggtactggga acctgagttt tatgaggcta 
16141 tgtacacacc acatacagtc ttgcaggctg taggtgcttg tgtattgtgc aattcacaga 
16201 cttcacttcg ttgcggtgcc tgtattagga gaccattcct atgttgcaag tgctgctatg 
16261 accatgtcat ttcaacatca cacaaattag tgttgtctgt taatccctat gtttgcaatg 
16321 ccccaggttg tgatgtcact gatgtgacac aactgtatct aggaggtatg agctattatt 
16381 gcaagtcaca taagcctccc attagttttc cattatgtgc taatggtcag gtttttggtt 
16441 tatacaaaaa cacatgtgta ggcagtgaca atgtcactga cttcaatgcg atagcaacat 
16501 gtgattggac taatgcrggc gattacatac ttgccaacac ttgtactgag agactcaagc 
16561 ttttcgcagc agaaacgctc aaagccactg aggaaacatt taagctgtca tatggtattg 
16621 ccactgtacg cgaagtactc tctgacagag aattgcatct ttcatgggag gttggaaaac 
16681 ctagaccacc attgaacaga aactatgtct ttactggtta ccgtgtaact aaaaatagta 
16741 aagtacagat tggagagtac acctttgaaa aaggtgacta tggtgatgct gttgtgtaca 



FIG. 10 (Cont'd)



WO 2004/085455 PCT/CN2004/000247 




16801 gaggtactac gacatacaag ttgaatgttg gtgattactt tgtgttgaca tctcacactg 
16861 taatgccact tagtgcacct actctagtgc cacaagagca ctatgtgaga attactggct 
16921 tgtacccaac actcaacatc tcagatgagt tttctagcaa tgttgcaaat tatcaaaagg 
16981 tcggcatgca aaagtactct acactccaag gaccacctgg tactggtaag agtcattttg 
17041 ccatcggact tgctctctat tacccatctg ctcgcatagt gtatacggca tgctctcatg 
17101 cagctgttga tgccctatgt gaaaaggcat taaaatattt gcccatagat aaatgtagta 
17161 gaatcatacc tgcgcgtgcg cgcgtagagt gttttgataa attcaaagtg aattcaacac 
17221 tagaacagta tgttttctgc actgtaaatg cattgccaga aacaactgct gacattgtag 
17281 tctttgatga aatctctatg gctactaatt atgacttgag tgttgtcaat gctagacttc 
17341 gtgcaaaaca ctacgtctat attggcgatc ctgctcaatt accagccccc cgcacattgc 
17401 tgactaaagg cacactagaa ccagaatatt ttaattcagt gtgcagactt atgaaaacaa 
17461 taggtccaga catgttcctt ggaacttgtc gccgttgtcc tgctgaaatt gttgacactg 
17521 tgagtgcttt agtttatgac aataagctaa aagcacacaa ggataagtca gctcaatgct 
17581 tcaaaatgtt ctacaaaggt gttattacac atgatgtttc atctgcaatc aacagacctc 
17641 aaataggcgt tgtaagagaa tttcttacac gcaatcctgc ttggagaaaa gctgttttta 
17701 tctcacctta taattcacag aacgctgtag cttcaaaaat cttaggattg cctacgcaga 
17761 ctgttgattc atcacagggt tctgaatatg actatgtcat attcacacaa actactgaaa 
17821 cagcacactc ttgtaatgtc aaccgcttca atgtggctat cacaagggca aaaattggca 
17881 thttgtgcat aatgtctgat agagatcttt atgacaaact gcaatttaca agtctagaaa 
17941 taccacgtcg caatgtggct acattacaag cagaaaatgt aactggactt tttaaggact 
18001 gtagtaagat cattactggt cttcatccta cacaggcacc tacacacctc agcgttgata 
18061 taaaattcaa gactgaagga ttatgtgttg acataccagg cataccaaag gacatgacct 
18121 accgtagact catctctatg atgggtttca aaatgaatta ccaagtcaat ggttacccta 
18181 atatgtttat cacccgcgaa gaagctattc gtcacgttcg tgcgtggatt ggctttgatg 
18241 tagagggctg tcatgcaact agagatgctg tgggtactaa cctacctctc cagctaggat 
18301 tttctacagg tgttaactta gtagctgtac cgactggtta tgttgacact gaaaataaca 
18361 cagaattcac cagagttaat gcaaaacctc caccaggtga ccagtttaaa catcttatac 
18421 cactcatgta taaaggcttg ccctggaatg tagtgcgtat taagatagta caaatgctca 
18481 gtgatacact gaaaggattg tcagacagag tcgtgttcgt cctttgggcg catggctttg
18541 agcttacatc aatgaagtac tttgtcaaga ttggacctga aagaacgtgt tgtctgtgtg 
18601 acaaacgtgc aacttgcttt tctacttcat cagatactta tgcctgctgg aatcattctg 
18661 tgggttttga ctatgtctat aacccattta tgattgatgt tcagcagtgg ggctttacgg 
18721 gtaaccttca gagtaaccat gaccaacatt gccaggtaca tggaaatgca catgtggcta 
18781 gttgtgatgc tatcatgact agatgtttag cagtccatga gtgctttgtt aagcgcgttg 
18841 attggtctgt tgaataccct attataggag atgaactgag ggttaattct gcttgcagaa 
18901 aagtacaaca catggttgtg aagtctgcat tgcttgctga taagtttcca gttcttcatg 
18961 acattggaaa tccaaaggct atcaagtgtg tgcctcaggc tgaagtagaa tggaagttct 
19021 acgatgctca gccatgtagt gacaaagctt acaaaataga ggaactcttc tattcttatg 
19081 ctacacatca cgataaattc actgatggtg tttgtttgtt ttggaattgt aacgttgatc 
19141 gttacccagc caatgcaatt gtgtgtaggt ttgacacaag agtcttgtca aacttgaact 
19201 taccaggctg tgatggtggt agtttgtatg tgaataagca tgcattccac actccagctt 
19261 tcgataaaag tgcatttact aatttaaagc aattgccttt cttttactat tctgatagtc
19321 cttgtgagtc tcatggcaaa caagtagtgt cggatattga ttatgttcca ctcaaatctg 
19381 ctacgtgtat tacacgatgc aatttaggtg gtgctgtttg cagacaccat gcaaatgagt 
19441 accgacagta cttggatgca tataatatga tgatttctgc tggatttagc ctatggattt
19501 acaaacaatt tgatacttat aacctgtgga atacatttac caggttacag agtttagaaa
19561 atgtggctta taatgttgtt aataaaggac actttgatgg acacgccggc gaagcacctg 
19621 tttccatcat taataatgct gtttacacaa aggtagatgg tattgatgtg gagatctttg 
19681 aaaataagac aacacttcct gttaatgttg catttgagct ttgggctaag cgtaacatta 
19741 aaccagtgcc agagattaag atactcaata atttgggtgt tgatatcgct gctaatactg 
19801 taatctggga ctacaaaaga gaagccccag cacatgtatc tacaataggt gtctgcacaa 
19861 tgactgacat tgccaagaaa cctactgaga gtgcttgttc ttcacttact gtcttgtttg
19921 atggtagagt ggaaggacag gtagaccttt ttagaaacgc ccgtaatggt gttttaataa 
19981 cagaaggttc agtcaaaggt ctaacacctt caaagggacc agcacaagct agcgtcaatg 
20041 gagtcacatt aattggagaa tcagtaaaaa cacagtttaa ctactttaag aaagtagacg 
20101 gcattattca acagttgcct gaaacctact ttactcagag cagagactta gaggatttta 
20161 agcccagatc acaaatggaa actgactttc tcgagctcgc tatggatgaa ttcatacagc
20221 gatataagct cgagggctat gccttcgaac acatcgttta tggagatttc agtcatggac 
20281 aacttggcgg tcttcattta atgataggct tagccaagcg ctcacaagat tcaccactta 
20341 aattagagga ttttatccct atggacagca cagtgaaaaa ttacttcata acagatgcgc 
20401 aaacaggttc atcaaaatgt gtgtgttctg tgattgatct tttacttgat gactttgtcg 
20461 agataataaa gtcacaagat ttgtcagtga tttcaaaagt ggtcaaggtt acaattgact 
20521 atgctgaaat ttcattcatg ctttggtgta aggatggaca tgttgaaacc ttctacccaa 
20581 aactacaagc aagtcaagcg tggcaaccag gtgttgcgat gcctaacttg tacaagatgc 
20641 aaagaatgct tcttgaaaag tgtgaccttc agaattatgg tgaaaatgct gttataccaa 
20701 aaggaataat gatgaatgtc gcaaagtata ctcaactgtg tcaatactta aatacactta 
20761 ctttagctgt accctacaac atgagagtta ttcactttgg tgctggctct gataaaggag 
20821 ttgcaccagg tacagctgtg ctcagacaat ggttgccaac tggcacacta cttgtcgatt 
20881 cagatcttaa tgacttcgtc tccgacgcag attctacttt aattggagac tgtgcaacag 
20941 tacatacggc taataaatgg gaccttatta ttagcgatat gtatgaccct aggaccaaac 
21001 atgtgacaaa agagaatgac tctaaagaag ggtttttcac ttatctgtgt ggatttataa 
21061 agcaaaaact agccctgggt ggttctatag ctgtaaagat aacagagcat tcttggaatg 
21121 ctgaccttta caagcttatg ggccatttct catggtggac agcttttgtt acaaatgtaa 
21181 abgcatcatc atcggaagca tttttaattg gggctaacta tcttggcaag ccgaaggaac 
21241 aaattgatgg ctataccatg catgctaact acattttctg gaggaacaca aatcctatcc
21301 agttgtcttc ctattcactc tttgacatga gcaaatttcc tcttaaatta agaggaactg 
21361 ctgtaatgtc tcttaaggag aatcaaatca atgatatgat ttattctctt ctggaaaaag 
21421 gtaggcttat cattagagaa aacaacagag ttgtggtttc aagtgatatt cttgttaaca 
21481 actaaacgaa catgtttatt ttcttattat ttcttactct cactagtggt agtgaccttg
21541 accggtgcac cacttttgat gatgttcaag ctcctaatta cactcaacat acttcatcta 
21601 tgaggggggt ttactatcct gatgaaattt ttagatcaga cactctttat ttaactcagg 
21661 atttatttct tccatrttat tctaatgtta cagggtttca tactattaat catacgtttg 
21721 gcaaccctgt catacctttt aaggatggta tttattttgc tgccacagag aaatcaaatg 
21781 ttgtccgtgg ttgggttttt ggttctacca tgaacaacaa gtcacagtcg gtgattatta 
21841 ttaacaattc tactaatgtt gttatacgag catgtaactt tgaattgtgt gacaaccctt 
21901 tctttgctgt ttctaaaccc atgggtacac agacacatac tatgatattc gataatgcat 
21961 ttaattgcac tttcgagtac atatctgatg ccttttcgct tgatgtttca gaaaagtcag 
22021 gtaattttaa acacttacga gagtttgtgt ttaaaaataa agatgggttt ctctatgttt 
22081 ataagggcta tcaacctata gatgtagttc gtgatctacc ttctggtttt aacactttga 
22141 aacctatttt taagttgcct cttggtatta acattacaaa ttttagagcc attcttacag 
22201 ccttttcacc tgctcaagac atttggggca cgtcagctgc agcctatttt gttggctatt 
22261 taaagccaac tacatttatg ctcaagtatg atgaaaatgg tacaatcaca gatgctgttg 
22321 attgttctca aaatccactt gctgaactca aatgctctgt taagagcttt gagattgaca 
22381 aaggaattta ccagacctct aatttcaggg ttgttccctc aggagatgtt gtgagattcc 
22441 ctaatattac aaacttgtgt ccttttggag aggtttttaa tgctactaaa ttcccttctg 
22501 tctatgcatg ggagagaaaa aaaatttcta attgtgttgc tgattactct gtgctctaca 
22561 actcaacatt tttttcaacc tttaagtgct atggcgtttc tgccactaag ttgaatgatc 
22621 tttgcttctc caatgtctat gcagattctt ttgtagtcaa gggagatgat gtaagacaaa 
22681 tagcgccagg acaaactggt gttattgctg attataatta taaattgcca gatgatttca 
22741 tgggttgtgt ccttgcttgg aatactagga acattgatgc tacttcaact ggtaattata 
22801 attataaata taggtatctt agacatggca agcttaggcc ctttgagaga gacatatcta 
22861 atgtgccttt ctcccctgat ggcaaacctt gcaccccacc tgctcttaat tgttattggc
22921 cattaaatga ttatggtttt tacaccacta ctggcattgg ctaccaacct tacagagttg 
22981 tagtactttc ttttgaactt ttaaatgcac cggccacggt ttgtggacca aaattatcca 
23041 ctgaccttat taagaaccag tgtgtcaatt ttaattttaa tggactcact ggtactggtg 
23101 tgttaactcc ttcttcaaag agatttcaac catttcaaca atttggccgt gatgtttctg 
23161 atttcactga ttccgttcga gatcctaaaa catctgaaat attagacatt tcaccttgct 
23221 cttttggggg tgtaagtgta attacacctg gaacaaatgc ttcatctgaa gttgctgttc 
23281 tatatcaaga tgttaactgc actgatgttt ctacagcaat tcatgcagat caactcacac 
23341 cagcttggcg catatattct actggaaaca atgtattcca gactcaagca ggctgtctta 
23401 taggagctga gcatgtcgac acttcttatg agtgcgacat tcctattgga gctggcattt 
23461 gtgctagtta ccatacagtt tctttattac gtagtactag ccaaaaatct attgtggctt 
23581 ctactaactt ttcaatcagc attactacag aagtaatgcc tgtttctatg gctaaaacct
23641 ccgtagattg taatatgtac atctgcggag attctactga atgtgctaat ttgcttctcc 
23701 aatatggtag cttttgcaca caactaaatc gtgcactctc aggtattgct gctgaacagg 
23761 atcgcaacac acgtgaagtg ttcgctcaag tcaaacaaat gtacaaaacc ccaactttga 
23821 aatattttgg tggttttaat ttttcacaaa tattacctga ccctctaaag ccaactaaga 
23881 ggtcttttat tgaggacttg ctctttaata aggtgacact cgctgatgct ggcttcatga 
23941 agcaatatgg cgaatgccta ggtgatatta atgctagaga tctcatttgt gcgcagaagt 
24001 tcaatggact tacagtgttg ccacctctgc tcactgatga tatgattgct gcctacactg
24061 ctgctctagt tagtggtact gccactgctg gatggacatt tggtgctggc gctgctcttc 
24121 aaataccttt tgctatgcaa atggcatata ggttcaatgg cattggagtt acccaaaatg 
24181 ttctctatga gaaccaaaaa caaatcgcca accaatttaa caaggcgatt agtcaaattc 
24241 aagaatcact tacaacaaca tcaactgcat tgggcaagct gcaagacgtt gttaaccaga 
24301 atgctcaagc attaaacaca cttgttaaac aacttagctc taattttggt gcaatttcaa 
24361 gtgtgctaaa tgatatcctt tcgcgacttg ataaagtcga ggcggaggta caaattgaca 
24421 ggttaattac aggcagactt caaagccttc aaacctatgt aacacaacaa ctaatcaggg 
24481 ctgctgaaat cagggcttct gctaatcttg ctgctactaa aatgtctigag tgtgttcttg 
24541 gacaatcaaa aagagttgac ttttgtggaa agggctacca ccttatgtcc ttcccacaag 

24601 cagccccgca tggtgttgtc ttcctacatg tcacgtatgt gccatcccag gagaggaact
24661 tcaccacagc gccagcaatt rgtcatgaag gcaaagcata cttccctcgt gaaggtgttt 
24721 ttgtgtttaa tggcacttct tggtttatta cacagaggaa cttcttttct ccacaaataa 
24781 ttactacaga caatacattt gtctcaggaa attgtgatgt cgttattggc atcattaaca 
24841 acacagttta tgatcctctg caacctgagc ttgactcatt caaagaagag ctggacaagt 
24901 acttcaaaaa tcatacatca ccagatgttg atcttggcga catttcaggc attaacgctt 
24961 ctgtcgtcaa cattcaaaaa gaaattgacc gcctcaatga ggtcgctaaa aatttaaatg 
25021 aatcactcat tgaccttcaa gaattgggaa aatatgagca atatattaaa tggccttggt 
25081 atgtttggct cggcttcatt gctggactaa ttgccatcgt catggttaca atcttgcttt 
25141 gttgcatgac tagttgtcgc agttgcctca agggtgcatg ctcttgtggt tcttgctgca 
25201 agtttgatga ggatgactct gagccagttc tcaagggtgt caaattacat tacacataaa 
25261 cgaacttatg gatttgttta tgagattttt tactcttgga tcaattactg cacagccagt
25321 aaaaattgac aatgcttctc ctgcaagtac tgttcatgct acagcaacga taccgctaca 
25381 agcctcactc cctttcggat ggcttgttat tggcgttgca tttcttgctg tttttcagag 
25441 cgctaccaaa ataattgcgc tcaataaaag atggcagcta gccctttata agggcttcca 
25501 gttcatttgc aatttactgc tgctatttgt taccatctat tcacatcttt tgcttgtcgc 
25561 tgcaggtaag gaggcgcaat ttttgtacct ctatgccttg atatattttc tacaatgcat 
25621 caacgcatgt agaattatta tgagatgttg gctttgttgg aagtgcaaat ccaagaaccc 

25681 attactttat gatgccaact actttgtttg ctggcacaca cataactatg actactgtat
25741 accatataac agtgtcacag atacaattgt cgttactgaa ggtgacggca tttcaacacc 
25801 aaaactcaaa gaagactacc aaattggtgg ttattctgag gataggcact caggtgttaa 
25861 agactatgtc gttgtacatg gctatttcac cgaagtttac taccagcttg agtctacaca 
25921 aattactaca gacactggta ttgaaaatgc tacattcttc atctttaaca agcttgttaa
25981 agacccaccg aatgtgcaaa tacacacaat cgacggctct tcaggagttg ctaatccagc 
26041 aatggatcca atttatgatg agccgacgac gactactagc gtgcctttgt aagcacaaga 
26101 aagtgagtac gaacttatgt actcattcgt ttcggaagaa acaggtacgt taatagttaa 
26161 tagcgtactt ctttttcttg cttccgtggt attcttgcta gtcacactag ccatccttac 
26221 tgcgcttcga ttgtgtgcgt actgctgcaa tattgttaac gtgagtttag taaaaccaac 
26281 ggtttacgtc tactcgcgtg ttaaaaatct gaactcttct gaaggagttc ctgatcttct 
26341 ggtctaaacg aactaactat tattattatt ctgtttggaa ctttaacatt gcttatcatg 
26401 gcagacaacg gtactattac cgttgaggag cttaaacaac tcctggaaca atggaaccta 
26461 gtaataggtt tcctattcct agcctggatt atgttactac aatttgccta ttctaatcgg
26521 aacaggtttt tgtacataat aaagcttgtt ttcctctggc tcttgtggcc agtaacactt 
26581 gcttgttttg tgcttgctgt tgtctacaga attaattggg tgactggcgg gattgcgatt 
26641 gcaatggctt gtattgtagg cttgatgtgg cttagctact tcgttgcttc cttcaggctg 
26701 tttgctcgta cccgctcaat gtggtcattc aacccagaaa caaacattct tctcaatgtg 
26761 cctctccggg ggacaattgt gaccagaccg ctcatggaaa gtgaacttgt cattggtgct 
26821 gtgatcattc gtggtcactt gcgaatggcc ggacactccc tagggcgctg tgacattaag 
26881 gacctgccaa aagagatcac tgtggctaca tcacgaacgc tttcttatta caaattagga
26941 gcgtcgcagc gtgtaggcac tgattcaggt tttgctgcat acaaccgcta ccgtattgga 
27001 aactataaat taaatacaga ccacgccggt agcaacgaca atattgcttt gctagtacag 
27061 taagtgacaa cagatgtttc atcttgttga cttccaggtt acaatagcag agatattgat 
27121 tatcattatg aggactttca ggattgctat ttggaatctt gacgttataa taagttcaat 
27181 agtgagacaa ttatttaagc ctctaactaa gaagaattat tcggagttag atgatgaaga 
27241 acctatggag ttagattatc cataaaacga acatgaaaat tattctcttc ctgacattga 
27301 ttgtatttac atcttgcgag ctatatcact atcaggagtg tgttagaggt acgactgtac 
27361 tactaaaaga accttgccca tcaggaacat acgagggcaa ttcaccattt caccctcttg 
27421 ctgacaataa atttgcacta acttgcacta gcacacactt tgcttttgct tgtgctgacg
27481 gtactcgaca tacctatcag ctgcgtgcaa gatcagtttc accaaaactt ttcatcagac 
27541 aagaggaggt tcaacaagag ctctactcgc cactttttct cattgttgct gctctagtat 
27601 ttttaatact ttgcttcacc attaagagaa agacagaatg aatgagctca ctttaattga
27661 cttctatttg tgctttttag cctttctgct attcciitgtt ttaataatgc ttattatatt
27721 ttggttttca ctcgaaatcc aggatctaga agaaccttgt accaaagtct aaacgaacat 
27781 gaaacttctc attgttttga cttgtatttc tctatgcagt tgcatatgca ctgtagtaca 
27841 gcgctgtgca tctaataaac ctcatgtgct tgaagatcct tgtaaggtac aacactaggg 
27901 gtaatactta tagcactgct tggctttgtg ctctaggaaa ggttttacct tttcatagat 
27961 ggcacactat ggttcaaaca tgcacaccta atgttactat caactgtcaa gatccagctg
28021 gtggtgcgct tatagctagg tgttggtacc ttcatgaagg tcaccaaact gctgcattta 
28081 gagacgtact tgttgtttta aataaacgaa caaattaaaa tgtctgataa tggaccccaa 
28141 tcaaaccaac gtagtgcccc ccgcattaca tttggtggac ccacagattc aactgacaat
28201 aaccagaatg gaggacgcaa tggggcaagg ccaaaacagc gccgacccca aggtttaccc 
28261 aataatactg cgtcttggtt cacagctctc actcagcatg gcaaggagga acttagattc 
28321 cctcgaggcc agggcgttcc aatcaacacc aatagtggtc cagatgacca aattggctac 
28381 taccgaagag ctacccgacg agttcgtggt ggtgacggca aaatgaaaga gctcagcccc 
28441 agatggtact tctattacct aggaactggc ccagaagctt cacttcccha cggcgctaac 
28501 aaagaaggca tcgtatgggt tgcaactgag ggagccttga atacacccaa agaccacatt 
28561 ggcacccgca atcctaataa caatgctgcc accgtgctac aacttcctca aggaacaaca 
28621 ttgccaaaag gcttctacgc agagggaagc agaggcggca gtcaagcctc ttctcgctcc 
28681 tcatcacgta gtcgcggtaa ttcaagaaat tcaactcctg gcagcagtag gggaaattct
28741 cctgctcgaa tggctagcgg aggtggtgaa actgccctcg cgctattgct gctagacaga 
28801 ttgaaccagc ttgagagcaa agtttctggt aaaggccaac aacaacaagg ccaaactgtc
28861 actaagaaat ctgctgctga ggcatctaaa aagcctcgcc aaaaacgtac tgccacaaaa
28921 cagtacaacg tcactcaagc atttgggaga cgtggtccag aacaaaccca aggaaatttc
28981 ggggaccaag acctaatcag acaaggaact gattacaaac attggccgca aattgcacaa 
29041 tttgctccaa gtgcctctgc attctttgga atgtcacgca ttggcatgga agtcacacct
29101 tcgggaacat ggctgactta tcatggagcc attaaattgg atgacaaaga tccacaattc
29161 aaagacaacg tcatactgct gaacaagcac attgacgcat acaaaacatt cccaccaaca 
29221 gagcctaaaa aggacaaaaa gaaaaagact gatgaagctc agcctttgcc gcagagacaa 
29281 aagaagcagc ccactgtgac tcttcttcct gcggctgaca tggatgattt ctccagacaa
29341 cttcaaaatt ccatgagtgg agcttctgct gattcaactc aggcataaac actcatgatg
29401 accacacaag gcagatgggc tatgtaaacg ttttcgcaat tccgtttacg atacatagtc
29461 tactcttgtg cagaatgaat tctcgtaact aaacagcaca agtaggttta gttaacttta 
29521 atctcacata gcaatcttta atcaatgtgt aacattaggg aggacttgaa agagccacca 
29581 cattttcatc gaggccacgc ggagtacgat cgagggtaca gtgaataatg ctagggagag 
29641 ctgcctatat ggaagagccc taatgtgtaa aattaatttt agtagtgcta tccccatgtg 
29701 attttaatag cttcttagga gaatgacaaa aaaaaaaaaa aa 
1 - ATATTAGGTTTTTACCTACCCAGGAAAAGCCAACCAACCTCGATCTCTTGTAGATCTGTT - 60
-ILGFYLPRKSQPTSISCRSV
-Y*VFTYPGKANQPRSLVDLF
IRFLPTQEKPTNLDLL*ICS
61 - CTCTAAACGAACTTTAAAATCTGTGTAGCTGTCGCTCGGCTGCATGCCTAGTGCACCTAC - 120
-L*TNFKICVAVARLHA*CTY
-SKRTLKSV*LSLGCMPSAPT
LNEL*NLCSCRSAACLVHLR
121 - GCAGTATAAACAATAATAAATTTTACTGTCGTTGACAAGAAACGAGTAACTCGTCCCTCT - 180
-AV*TIINFTVVDKKRVTRPS
-QYKQ**ILLSLTRNE*LVPL
SINNNKFYCR*QETSNSSLF
181 - TCTGCAGACTGCTTACGGTTTCGTCCGTGTTGCAGTCGATCATCAGCATACCTAGGTTTC - 240
-SADCLRFRPCCSRSSAYLGF
-LQTAYGFVRVAVDHQHT*VS
CRLLTVSSVLQSIISIPRFR
241 - GTCCGGGTGTGACCGAAAGGTAAGATGGAGAGCCTTGTTCTTGGTGTCAACGAGAAAACA - 300
-VRV*PKGKMESLVLGVNEKT
-SGCDRKVRWRALFLVSTRKH
PGVTER*DGEPCSWCQRENT

301 - CACGTCCAACTCAGTTTGCCTGTCCTTCAGGTTAGAGACGTGCTAGTGCGTGGCTTCGGG - 360
-HVQLSLPVLQVRDVLVRGFG
-TSNSVCLSFRLETC*CVASG
RPTQFACPSG*RRASAWLRG

361 - GACTCTGTGGAAGAGGCCCTATCGGAGGCACGTGAACACCTCAAAAATGGCACTTGTGGT - 420
-DSVEEALSEAREHLKNGTCG
-TLWKRPYRRHVNTSKMALVV
LCGRGPIGGT*TPQKWHLWS
421 - CTAGTAGAGCTGGAAAAAGGCGTACTGCCCCAGCTTGAACAGCCCTATGTGTTCATTAAA - 480
-LVELEKGVLPQLEQPYVFIK
-**SWKKAYCPSLNSPMCSLN
SRAGKRRTAPA*TALCVH*T
481 - CGTTCTGATGCCTTAAGCACCAATCACGGCCACAAGGTCGTTGAGCTGGTTGCAGAAATG - 540
-RSDALSTNHGHKVVELVAEM
-VLMP*APITATRSLSWLQKW
F*CLKHQSRPQGR*AGCRNG
541 - GACGGCATTCAGTACGGTCGTAGCGGTATAACACTGGGAGTACTCGTGCCACATGTGGGC - 600
-DGIQYGRSGITLGVLVPHVG
-TAFSTVVAV*HWEYSCHMWA
RHSVRS*RYNTGSTRATCGR
601 - GAAACCCCAATTGCATACCGCAATGTTCTTCTTCGTAAGAACGGTAATAAGGGAGCCGGT - 660
-ETPIAYRNVLLRKNGNKGAG
-KPQLHTAMFFFVRTVIREPV
NPNCIPQCSSS*ER**GSRW
661 - GGTCATAGCTATGGCATCGATCTAAAGTCTTATGACTTAGGTGACGAGCTTGGCACTGAT - 720
-GHSYGIDLKSYDLGDELGTD
-VIAMASI*SLMT*VTSLALI
S*LWHRSKVL*LR*RAWH*S
721 - CCCATTGAAGATTATGAACAAAACTGGAACACTAAGCATGGCAGTGGTGCACTCCGTGAA - 780
-PIEDYEQNWNTKHGSGALRE
-PLKIMNKTGTLSMAVVHSVN
H*RL*TKLEH*AWQWCTP*T

781 - CTCACTCGTGAGCTCAATGGAGGTGCAGTCACTCGCTATGTCGACAACAATTTCTGTGGC - 840
-LTRELNGGAVTRYVDNNFCG
-SLVSSMEVQSLAMSTTISVA
HS*AQWRCSHSLCRQQFLWP
841 - CCAGATGGGTACCCTCTTGATTGCATCAAAGATTTTCTCGCACGCGCGGGCAAGTCAATG - 900
-PDGYPLDCIKDFLARAGKSM
-QMGTLLIASKIFSHARASQC
RWVPS*LHQRFSRTRGQVNV
901 - TGCACTCTTTCCGAACAACTTGATTACATCGAGTCGAAGAGAGGTGTCTACTGCTGCCGT - 960
-CTLSEQLDYIESKRGVYCCR
-ALFPNNLITSSRREVSTAAV
HSFRTT*LHRVEERCLLLP*
961 - GACCATGAGCATGAAATTGCCTGGTTCACTGAGCGCTCTGATAAGAGCTACGAGCACCAG - 1020
-DHEHEIAWFTERSDKSYEHQ
-TMSMKLPGSLSALIRATSTR
P*A*NCLVH*AL**ELRAPD
1021 - ACACCCTTCGAAATTAAGAGTGCCAAGAAATTTGACACTTTCAAAGGGGAATGCCCAAAG - 1080
-TPFEIKSAKKFDTFKGECPK
-HPSKLRVPRNLTLSKGNAQS
TLRN*ECQEI*HFQRGMPKV
1081 - TTTGTGTTTCCTCTTAACTCAAAAGTCAAAGTCATTCAACCACGTGTTGAAAAGAAAAAG - 1140
-FVFPLNSKVKVIQPRVEKKK
-LCFLLTQKSKSFNHVLKRKR
CVSS*LKSQSHSTTC*KEKD
1141 - ACTGAGGGTTTCATGGGGCGTATACGCTCTGTGTACCCTGTTGCATCTCCACAGGAGTGT - 1200
-TEGFMGRIRSVYPVASPQEC
-LRVSWGVYALCTLLHLHRSV
*GFHGAYTLCVPCCISTGV*
1201 - AACAATATGCACTTGTCTACCTTGATGAAATGTAATCATTGCGATGAAGTTTCATGGCAG - 1260
-NNMHLSTLMKCNHCDEVSWQ
-TICTCLP**NVIIAMKFHGR
QYALVYLDEM*SLR*SFMAD
1261 - ACGTGCGACTTTCTGAAAGCCACTTGTGAACATTGTGGCACTGAAAATTTAGTTATTGAA - 1320
-TCDFLKATCEHCGTENLVIE
-RATF*KPLVNIVALKI*LLK
VRLSESHL*TLWH*KFSY*R
1321 - GGACCTACTACATGTGGGTACCTACCTACTAATGCTGTAGTGAAAATGCCATGTCCTGCC - 1380
-GPTTCGYLPTNAVVKMPCPA
-DLLHVGTYLLML**KCHVLP
TYYMWVPTY*CCSENAMSCL
1381 - TGTCAAGACCCAGAGATTGGACCTGAGCATAGTGTTGCAGATTATCACAACCACTCAAAC - 1440
-CQDPEIGPEHSVADYHNHSN
-VKTQRLDLSIVLQIITTTQT
SRPRDWT*A*CCRLSQPLKH
1441 - ATTGAAACTCGACTCCGCAAGGGAGGTAGGACTAGATGTTTTGGAGGCTGTGTGTTTGCC - 1500
-IETRLRKGGRTRCFGGCVFA
-LKLDSAREVGLDVLEAVCLP
*NSTPQGR*D*MFWRLCVCL
1501 - TATGTTGGCTGCTATAATAAGCGTGCCTACTGGGTTCCTCGTGCTAGTGCTGATATTGGC - 1560
-YVGCYNKRAYWVPRASADIG
-MLAAIISVPTGFLVLVLILA
CWLL**ACLLGSSC*C*YWL
1561 - TCAGGCCATACTGGCATTACTGGTGACAATGTGGAGACCTTGAATGAGGATCTCCTTGAG - 1620
-SGHTGITGDNVETLNEDLLE
-QAILALLVTMWRP*MRISLR
RPYWHYW*QCGDLE*GSP*D
1621 - ATACTGAGTCGTGAACGTGTTAACATTAACATTGTTGGCGATTTTCATTTGAATGAAGAG - 1680
-ILSRERVNINIVGDFHLNEE
-Y*VVNVLTLTLLAIFI*MKR
TES*TC*H*HCWRFSFE*RG
1681 - GTTGCCATCATTTTGGCATCTTTCTCTGCTTCTACAAGTGCCTTTATTGACACTATAAAG - 1740
-VAIILASFSASTSAFIDTIK
-LPSFWHLSLLLQVPLLTL*R
CHHFGIFLCFYKCLY*HYKE

1741 - AGTCTTGATTACAAGTCTTTCAAAACCATTGTTGAGTCCTGCGGTAACTATAAAGTTACC - 1800
-SLDYKSFKTIVESCGNYKVT
-VLITSLSKPLLSPAVTIKLP
S*LQVFQNHC*VLR*L*SYQ
1801 - AAGGGAAAGCCCGTAAAAGGTGCTTGGAACATTGGACAACAGAGATCAGTTTTAACACCA - 1860
-KGKPVKGAWNIGQQRSVLTP
-RESP*KVLGTLDNRDQF*HH
GKARKRCLEHWTTEISFNTT

1861 - CTGTGTGGTTTTCCCTCACAGGCTGCTGGTGTTATCAGATCAATTTTTGCGCGCACACTT - 1920
-LCGFPSQAAGVIRSIFARTL
-CVVFPHRLLVLSDQFLRAHL
VWFSLTGCWCYQINFCAHT*
1921 - gatgcagcaaaccactcaattcctgatttgcaaagagcagctgtcaccatacttgatggt - 1980 
-daanhs ipdlqraavt ildg 
-mq'q ttqflickeq ls pylmv 
cs;kplns*faksschht*wy 
1981 - atttctgaacagtcattacgtcttgtcgacgccatggtttatacttcagacctgctcacg - 2040 
-i seqslrlvdamvyts dllt 
-fl ! nshyvlstpwfilqtcsp 

F *! T V ITSCRRHGLYFRPAHQ 
2041 - AACAGTGTCATTATTATGGCATATGTAACTGGTGGTCTTGTACAACAGACTTCTCAGTGG - 2100 

- N S V I I MAYVTGGLVQQTSQW 
-TV:SLLWHM*LVVLYNRLLSG 

QC'HYYGTCNWWS CTTDFSVV 
2101 - TTGTCTAATCTTTTGGGCACTACTGTTGAAAAACTCAGGCCTATCTTTGAATGGATTGAG - 2160 
-LSNLLGTTVEKLRP I FEW IE 

- C L I FWALLLKNSGLSLNGLR 

V* ! SFGHYC*KTQAYL*MD*G 

2161 - gcgaaacttagtgcaggagttgaatttctcaaggatgcttgggagattctcaaatttctc - 2220 
-aklsagveflkdaweilkfl 
-rn;lvqelnfsrmlgrfsnfs 

ET:*CRS*ISQGCLGDSQISH 
2221 - ATTACAGGTGTTTTTGACATCGTCAAGGGTCAAATACAGGTTGCTTCAGATAACATCAAG - 2280
-IT<jjVFDIVKGQIQVAS dnik 
-LQ;VFLTSSRVKYRLLQITSR 

yr:cf + hrqgsntgcfr*hqg 

2281 - GATTGTGTAAAATGCTTCATTGATGTTGTTAACAAGGCACTCGAAATGTGCATTGATCAA - 2340
~DCVKCFIDVVNKAL EMCIDQ 

- I V:* N AS LMLLT RH S KCAL I K 

lckmlh*cc*qgtrnvh*ss 
2341 - gtcactatcgctggcgcaaagttgcgatcactcaacttaggtgaagtcttcatcgctcaa - 2400 
-vtiagaklrslnlgevfiaq 
-sl;slaqscdhst*vkssslk 
hy:rwrkvaitqlr*slhrsk 
2401 - agcaagggactttaccgtcagtgtatacgtggcaaggagcagctgcaactactcatgcct - 2460
-skglyrqcirgkeqlqllmp 

- ar;dftvsvyvarsscnyscl 

q g; t l p svytw qgaaat thas 
2461 - cttaaggcaccaaaagaagtaacctttcttgaaggtgattcacatgacacagtacttacc - 2520

- L K ft PKEVTFLEG DS H DTVLT 

LR HQ K K * PFLKV IHMTQYLP 
*G TKRSNLS*R*FT*HSTYL 
2521 - tctgaggaggttgttctcaagaacggtgaactcgaagcactcgagacgcccgttgatagc - 2580
-seevvlkngelealetpvds
lr rl f srtv n skh 5 rr pl ia 
*qgcsqer*trstrdar**l 
2581 - ttcacaaatggagctatcgtcggcacaccagtctgtgtaaatggcctcatgctcttagag - 2640
-ft ngaivgtpvcvnglmlle 
-sq : melssahqsv*mascs*r 
hkwsyrrhtslckwphalrd 
2641 - attaaggacaaagaacaatactgcgcattgtctcctggtttactggctacaaacaatgtc - 2700
-ikdkeqycalspgllatnnv
~ lr tknntah cllvyw lqtms 
*g, qrtilrivswftgykqcl 
2701 - tttcgcttaaaagggggtgcaccaattaaaggtgtaacctttggagaagatactgtttgg - 2760
-frlkggap i kgvtfge d t v w 
fa* kgvhqlkv* pleki lfg 
s l : k r g c t n * rcnlwrryclg 
2761 - gaagttcaaggttacaagaatgtgagaatcacatttgagcttgatgaacgtgttgacaaa - 2820
-evqgyknvrit fe1dervdk 
-kf : ;kvtrm*bshi»slmnvltk 
ss;rlqecenhi*a**tc*qs 
2821 - gtgcttaatgaaaagtgctctgtctacactgttgaatccggtaccgaagttactgagttt - 2880 
-vlnekcsvytvesgtevtef 

- cl mks a l s t l 1> n pv p kllsl 

a*:*kvlclhc*iryrsy*vc 
2881 - gcatgtgttgtagcagaggctgttgtgaagactttacaaccagtttctgatctccttacc - 2940 
-acvvaeavvktlqpvs dllt 

- r v l * q r l l * rlynqfli slp 

mcscsrgccedfttsf* spyq 
2941 - aacatgggtattgatcttgatgagtggagtgtagctacattctacttatttgatgatgct - 3000 

- N M <h I DLDEWSVATFYLFDDA 
-TW-VLILMSGV* LHSTYLMML 

HG Y*S* *VECSYILLI**CW 
3001 - GGTGAAGAAAACTTTTCATCACGTATGTATTGTTCCTTTTACCCTCCAGATGAGGAAGAA - 3060 
FS SRMYCS FYPP DEEE 
~VK ; KT FHHVCIVPFTIjQMRKK 
* R:KL F I TYVLFLLPS R * GRR 
3061 - GAGGACGATGCAGAGTGTGAGGAAGAAGAAATTGATGAAACCTGTGAACATGAGTACGGT - 3120
-EDpAECEEEEIDETCEHEYG 
-RT^MQSVRKKKLMKPVNMSTV 
GR;CRV * GRRN * * N L * T * VRY 
3121 - ACAGAGGATGATTATCAAGGTCTCCCTCTGGAATTTGGTGCCTCAGCTGAAACAGTTCGA - 3180
-TEODYQGLPLEFGASAETVR 
Q R : M I IKVSLWNLVPQLKQFE 

rg;*lsrspsgiwcls*nsss 
3181 - gttgaggaagaagaagaggaagactggctggatgatactactgagcaatcagagattgag - 3240
-veeeeeedwiiddtteqseie 
-lr'kkkrktgwmillsnqrls 
*g:rrrgrlag*yy*aird*a 
3241 - ccagaaccagaacctacacctgaagaaccagttaatcagtttactggttatttaaaactt - 3300 
-pepept peepvnqftgylkl 
-qn;qnlhlknqlisllvi*nl 
rt rtyt * rts * svywlfkty 
3301 - actgacaatgttgccattaaatgtgttgacatcgttaaggaggcacaaagtgctaatcct - 3360
-tdnvai kcvdivkeaqsanp 

- lt'mlplnvltslrrhkvlil 

*q'cch*mc*hr*ggtkc*sy 
3361 - ATGGTGATTGTAAATGCTGCTAACATACACCTGAAACATGGTGGTGGTGTAGCAGGTGCA - 3420
-MVIVNAAN I HLKHGGGVAGA 
-W*L*MLLTYT*NMVVV*QVH 
GDCKCC*HTPETWWWCSRCT 

3421 - CTCAACAAGGCAACCAATGGTGCCATGCAAAAGGAGAGTGATGATTACATTAAGCTAAAT - 3480
-LNKATNGAMQKES3DYIKLN 

- S T R Q PMVPCKRRVMI TLS * M 

Q Q. GNQWCHAKGE**I>H*AKW 
3481 - GGCCCTCTTACAGTAGGAGGGTCTTGTTTGCTTTCTGGACATAATCTTGCTAAGAAGTGT - 3540
-GPiLTVGGS CLLSGHNLAKKC 
-AL'LQ* EGLVCFLDI I L I* R S V 
PS : YSRRVLFAFWT* S C * E V S 
3541 - CTGCATGTTGTTGGACCTAACCTAAATGCAGGTGAGGACATCCAGCTTCTTAAGGCAGCA - 3600 
-LHVVGPNLNAGEDXQLLKAA 
-CM LL DLT * MQVRTSS FLRQH 
A O C W T * PKCR* GHPAS * G S I 
3601 - TATGAAAATTTCAATTCACAGGACATCTTACTTGCACCATTGTTGTCAGCAGGCATATTT - 3660 
-YENFNSQD I LLAPLLSAGI F 

- M K : I S IHRTSYLHHCCQQAYL 

*K : . FQFTGHLTCT IVVSRHIW 
3661 - GGTGCTAAACCACTTCAGTCTTTACAAGTGTGCGTGCAGACGGTTCGTACACAGGTTTAT - 3720
-GAKPIiQS LQVCVQTVRTQVY 
-VL NHFSLYKCACRRFVHRFI 
C*:TTSVFTSVRADGSYTGLY 
3721 - ATTGCAGTCAATGACAAAGCTCTTTATGAGCAGGTTGTCATGGATTATCTTGATAACCTG - 3780 

- I A V N DKALYEQVVM DYLDNL 
-LQiSMTKLPMSRLSWI I L I T * 

CsiQ*QSSL*AGCHGLS**PE 
3781 - AAGCCTAGAGTGGAAGCACCTAAACAAGAGGAGCCACCAAACACAGAAGATTCCAAAACT - 3840
-KPRVEAPKQ E E P PNTE DSKT 

- s l'ew KHLN KR SHQTQKI PKL 

a*:sgst*trgatkhrrfqn* 
3841 - gaggagaaatctgtcgtacagaagcctgtcgatgtgaagccaaaaattaaggcctgcatt - 3900 
-eeksvvqkpvdvkpkikaci 
-rr-. mlsyrslsm* sqklrpal 
ge;icrteacrceakn*glh* 
3901 - gatgaggttaccacaacactggaagaaactaagtttcttaccaataagttactcttgttt - 3960 
-devtttleetkfltnklllf 
-mr|lpqhwkkls flpi syscl 
* g'yhntgrn* v s y q * vtlvc 
3961 - gctgatatcaatggtaagctttaccatgattctcagaacatgcttagaggtgaagatatg - 4020
-adIngkly hdsqnk lrgedm 
-li-smvsftmilrtclevkic 
*y:qw*alp*fseha*r*ryv 
4021 - tctttccttgagaaggatgcaccttacatggtaggtgatgttatcactagtggtgatatc - 4080

-SFLEKDAPYMVGDVITSGDI 

- LS'LRRMHLTW*VMLSLVV I S 

FP:*EGCTLHGR*CYH*W*YH 
4081 - ACTTGTGTTGTAATACCCTCCAAAAAGGCTGGTGGCACTACTGAGATGCTCTCAAGAGCT - 4140
-TCVVIPSKKAGGTTEMLSRA 
-LV''L*YPPKRIiVALLRCSQEI> 
L C ! C N TLQKGWWHY * DALKSF 
4141 - TTGAAGAAAGTGCCAGTTGATGAGTATATAACCACGTACCCTGGACAAGGATGTGCTGGT - 4200
-LKKVPVDEY ITTYPGQGCAG 

- * R^KCQLMS I * PRTLDKDVLV 

EE SAS**VYNHVPWTRMCWL 
4201 - TATACACTTGAGGAAGCTAAGACTGCTCTTAAGAAATGCAAATCTGCATTTTATGTACTA - 4260
-YTLEEAKTALKKCKSAFYVL 
IH : LRKLRLLLrRNANLHFMYY 
YT;*GS*DCS*EMQICILCTT 
4261 - CCTTCAGAAGCACCTAATGCTAAGGAAGAGATTCTAGGAACTGTATCCTGGAATTTGAGA - 4320
-PSEAPNAKEEILGTVSWNLR 

- LQ KHLMLRKR F * ELY PG I * E 

FR"ST*C*GRDSRNCILEFER 
4321 - GAAATGCTTGCTCATGCTGAAGAGACAAGAAAATTAATGCCTATATGCATGGATGTTAGA - 4380 
-EMJjAHAEE T RKLM? I CMDVR 

- KC'LLMLKRQBN * CLYAWMLE 

N A. CSC*RDKKINAYMHGC*S 
4381 - GCCATAATGGCAACCATCCAACGTAAGTATAAAGGAATTAAAATTCAAGAGGGCATCGTT - 4440
-AIMATIQRKYKGIKIQEGIV 

- P * \ W Q P SNVS I KELKFKRASL 

HN;GNHPT*V*RN*NSRGHR* 

4441 - GACTATGGTGTCCGATTCTTCTTTTATACTAGTAAAGAGCCTGTAGCTTCTATTATTACG - 4500

-DYGVRFFFYTSKEPVASI IT 
-TM VSDSSFILVKSL*LLLLR 
LW ; CPILLLY**RACSFYYYE 
4501 - AAGCTGAACTCTCTAAATGAGCCGCTTGTCACAATGCCAATTGGTTATGTGACACATGGT - 4560
-KLHSL.NEPIiVTMPIGYVTHG 
-S*TL*MSRLSQCQLVM*HMV 
AE'LSK*AACHNANWLCDTWF 

4561 - TTTAATCTTGAAGAGGCTGCGCGCTGTATGCGTTCTCTTAAAGCTCCTGCCGTAGTGTCA - 4620

-FNkEEAARCMRSLKAPAVVS 
L IjLKRLRAV CVLLKLLP * CQ 
*S;*RGCALYAFS*SSCRSVS 
4621 - GTATCATCACCAGATGCTGTTACTACATATAATGGATACCTCACTTCGTCATCAAAGACA - 4680
-VSS'PDAVTTYNGYLTSSSKT 
-YH:HQMLLLH IMDTSLRHQRfl 

ii.trccyyi*wiphfvikdi 
4681 - tctgaggagcactttgtagaaacagtttctttggctggctcttacagagattggtcctat - 4740
-se&hfvetvslagsyrdwsy 
-lr:stl*kqflwlalteigpi 
* gjalcrns ffgwllqrlvlf 
4741 - tcaggacagcgtacagagttaggtgttgaatttcttaagcgtggtgacaaaattgtgtac - 4800
-sgqrtelgve flkrgdkivy 
-qd. : svqs*vlnflsvvtklct 
rt;ayrvrc*is*aw*qncvp 
4801 - cacactctggagagccccgtcgagtttcatcttgacggtgaggttctttcacttgacaaa - 4860
-htlespvefhldgevlsldk
-tl;wrapss filtvrffhltn 
hsgeprrvss*r*gsft*qt 
4861 - ctaaagagtctcttatccctgcgggaggttaagactataaaagtgttcacaactgtggac - 4920
-lksllslrevkt ikvfttvd 
-*rvsypcgrlrl*kcsqlwt 
kejslipagg* dyksvhncgq 

4921 - AACACTAATCTCCACACACAGCTTGTGGATATGTCTATGACATATGGACAGCAGTTTGGT - 4980
-NTtiLHTQLVDMSMTYGQQFG 
-TL;ISTHSLWICL*HMDS5LV 
H* ; SPHTACGYVYDIWTAVWS 
4981 - CCAACATACTTGGATGGTGCTGATGTTACAAAAATTAAACCTCATGTAAATCATGAGGGT - 5040
-PTYLDGADVTKIKPHVNHEG 
-QHTWMVLMLQKLNLM* IMRV 
NI.LGWC*CYKN*TSCKS*G* 
5041 - AAGACTTTCTTTGTACTACCTAGTGATGACACACTACGTAGTGAAGCTTTCGAGTACTAC - 5100
-KTFFVLPS DDTLRSEAFEYY 
-RLSLYYLVMTHYVVKLSSTT 
DF ; LCTT* * *HTT* * S F R V L P 
5101 - CATACTCTTGATGAGAGTTTTCTTGGTAGGTACATGTCTGCTTTAAACCACACAAAGAAA - 5160 
-HTtDESFIiGRYMSALNHTKK 
-ILLMRVFLVGTCI ) L*TTQRN 
Y S * *EFSW*VHVCFKPHKEM 
5161 - TGGAAATTTCCTCAAGTTGGTGGTTTAACTTCAATTAAATGGGCTGATAACAATTGTTAT - 5220 
-WKFPQVGG.LTS IKWADNNCY 
-GN:FLKLVV* LQLNGLITIVI 
EI-SSSWWFNFN*MG**QLLF 
5221 - TTGTCTAGTGTTTTATTAGCACTTCAACAGCTTGAAGTCAAATTCAATGCACCAGCACTT - 5280

- L S SVLLALQQLEVKFNAPAL 

- CLV FY * H FN S LKS NSMHQHF 

V*. CFISTSTA*SQIQCTSTS 
5281 - CAAGAGGCTTATTATAGAGCCCGTGCTGGTGATGCTGCTAACTTTTGTGCACTCATACTC - 5340
-QEAYYRARAGDAANFCALIL 

- K R L I IEPVLVMLLTFVHSYS 

RG-LL*SPCW*CC*LLCTHTR 
5341 - GCTTACAGTAATAAAACTGTTGGCGAGCTTGGTGATGTCAGAGAAACTATGACCCATCTT - 5400
-AYSNKTVGELGDVRETMTHL 

- lt v i kllas lvmsekl* p if 

lq:**ncwraw*cqrnydpss 
5401 - ctacagcatgctaatttggaatctgcaaagcgagttcttaatgtggtgtgtaaacattgt - 5460
-lqhanlesakrv lnvvckhc 
~ys;mi»iwnlqseflmwcvniv 
ta:c*fgickass*cgv*tlw 
5461 - ggtcagaaaactactaccttaacgggtgtagaagctgtgatgtatatgggtactctatct - 5520 
-gqktttltgveavmymgtls 

-VR:KLLP*RV*KIj* CIWVLYL 
SEjNYYLNGCRSCDVYGYSIL 
5521 - TATGATAATCTTAAGACAGGTGTTTCCATTCCATGTGTGTGTGGTCGTGATGCTACACAA - 5580 
-YDSILKTGVS I PCVCGRDATQ 
-MI ILRQVFP FHVCVVVMLHN 
**!S*DRCFHSMCVWS*CYTI 
5581 - TATCTAGTACAACAAGAGTCTTCTTTTGTTATGATGTCTGCACCACCTGCTGAGTATAAA - 5640
-YL\fQQESSFVMMSAPPAEYK 

- I * i YNKSLLLL*CLHHLLSIN 

SS:TTRVFFCYDVCTTC*V*I 
5641 - TTACAGCAAGGTACATTCTTATGTGCGAATGAGTACACTGGTAACTATCAGTGTGGTCAT - 5700
-LQQGTFLCANEY T GNYQCGH 
-YS KVHSYVRMSTLVTISVVI 
TA'RYILMCE*VHW *LSVWSL 
5701 - TACACTCATATAACTGCTAAGGAGACCCTCTATCGTATTGACGGAGCTCACCTTACAAAG - 5760 

- Y T HI TAKETLYR I DGAHLTK 
-TL'I*LLRRPSIVLTELTLQR 

H S YNC* GDPLS Y * RS S PYKD 
5761 - ATGTCAGAGTACAAAGGACCAGTGACTGATGTTTTCTACAAGGAAACATCTTACACTACA - 5820
-MSEYKGPVT DVFYKETSYTT 
-CQ.STKDQ*LMFSTRKHLTLQ 
VR VQRTSD * C F L Q G N I LHYN 
5821 - ACCATCAAGCCTGTGTCGTATAAACTCGATGGAGTTACTTACACAGAGATTGAACCAAAA - 5880

- T I KPVSYKLDGVTYTEIEPK 

PS-SLCRINSMELLTQRLNQN 
HQACVV*TRWSYLHRD*TKI 
5881 - TTGGATGGGTATTATAAAAAGGATAATGCTTACTATACAGAGCAGCCTATAGACCTTGTA - 5940

-ldgyykkdnayyteqpidlv 
w m g i ikrimltiqssl* tly 
gw. vl*kg*ciilyraayrpct 

5941 - CCAACTCAACCATTACCAAATGCGAGTTTTGATAATTTCAAACTCACATGTTCTAACACA - 6000
-PTQPLPNASFDNFKLTCSNT 
-QL.NHYQMRVLII SNSHVLTQ 
NS. TITKCEF**FQTHMF*HK 

6001 - AAATTTGCTGATGATTTAAATCAAATGACAGGCTTCACAAAGCCAGCTTCACGAGAGCTA - 6060
-KFAD DLNQMTGFTKPASREL 

- N L : L M I * IK*QASQSQLHESY 

I C{ * * FKSN DRLHKAS FTRAI 
6061 - TCTGTCACATTCTTCCCAGACTTGAATGGCGATGTAGTGGCTATTGACTATAGACACTAT - 6120
-SVTF FP DLNGDVVAI DYRHY 

- LS'H S SQT * M A M * WLLT I DT I 

CH:ILPRLEWRCSGY*L*TLF 
6121 - TCAGCGAGTTTCAAGAAAGGTGCTAAATTACTGCATAAGCCAATTGTTTGGCACATTAAC - 6180 
-SASFKKGAKLLHKPIVWHIN 
-QR.VSRKVLNYCI SQLFGTLT 
SEiFQERC*ITA*ANCLAH*P 
6181 - CAGGCTACAACCAAGACAACGTTCAAACCAAACACTTGGTGTTTACGTTGTCTTTGGAGT - 6240
-QATT KTTFKPNTWCLRCIiWS 
-RIisQPRQRSNQTLGVYVVFGV 
GYiNQDNVQTKHLVFTLSLEY 
6241 - ACAAAGCCAGTAGATACTTCAAATTCATTTGAAGTTCTGGCAGTAGAAGACACACAAGGA - 6300
-TKPVD.TSNSFEVLAVEDTQG 

- Q S ]Q * ILQIHLKFWQ* KTHKE 

KA;SRYFKFI *SSGSRRHTRN 
6301 - ATGGACAATCTTGCTTGTGAAAGTCAACAACCCACCTCTGAAGAAGTAGTGGAAAATCCT - 6360 
-MDNLACESQQPTSEEVVENP 
-WT;IIiLVKVNNPPLKK* WKIL 
GQiSCL*KSTTHL*RSSGKSY 
6361 - ACCATACAGAAGGAAGTCATAGAGTGTGACGTGAAAACTACCGAAGTTGTAGGCAATGTC - 6420 

- T I 0KEVI ECDVKTTEVVGNV 
-PY ;RRKS *SVT*KLPKL*AMS 

HTjEGSHRV*RENYRSCRQCH 
6421 - ATACTTAAACCATCAGATGAAGGTGTTAAAGTAACACAAGAGTTAGGTCATGAGGATCTT - 6480

-i l $ p s degvkvt qelghedl 

- yl'nhqmkvlk*hks*vmril 

t*;tir*rc*sntrvrs*gsy 
6481 - atggctgcttatgtggaaaacacaagcattaccattaagaaacctaatgagctttcacta - 6540 
-maayventsiti kkpnelsl 

-WL -LMWKTQALPLRNIjMSFH* 

gc;lcgkhkhyh*et**afts 

6541 - GCCTTAGGTTTAAAAACAATTGCCACTCATGGTATTGCTGCAATTAATAGTGTTCCTTGG - 6600 
-ALQLKTIATHGIAAI nsvpw 
-P* V*KQLPLMVLLQLIVFLG 
LR.FKNNCHSWYCCN**CSLE 

6601 - AGTAAAATTTTGGCTTATGTCAAACCATTCTTAGGACAAGCAGCAATTACAACATCAAAT - 6660 

- S K J LAYVKPFLGQAAI TTSN 
~VK : FWLMSNHS*DKQQLQHQI 

*N FGLCQTILRTSSNYNIKL 
6661 - TGCGCTAAGAGATTAGCACAACGTGTGTTTAACAATTATATGCCTTATGTGTTTACATTA - 6720
-CA^RLAQRVFNNYMP YVFTL 
-AL'RD*HNVCLTI ICLMCLHY 
R*.EISTTCV*QLYALCVYII 
6721 - TTGTTCCAATTGTGTACTTTTACTAAAAGTACCAATTCTAGAATTAGAGCTTCACTACCT - 6780
-LFQLCTFTKSTNSRIRASLP 
-CS NCVLLLKVPI LELELHYL 
VFIVYFY*KYQF*N*SFTTY 

6781 - ACAACTATTGCTAAAAATAGTGTTAAGAGTGTTGCTAAATTATGTTTGGATGCCGGCATT - 6840

- T T k A K N SVKSVAKLCL DAGI 
-QLLLKIVLRVLLNYVWMPAL 

NY C*K*C*ECC*IMFGCRH* 
6841 - AATTATGTGAAGTCACCCAAATTTTCTAAATTGTTCACAATCGCTATGTGGCTATTGTTG - 6900 
-NYVKSPKFSKLFTXAMWLLL 

- i m * shpnflncsqslcgycc 

lc'evtqif*ivhnryvaivv 

6901 - TTAAGTATTTGCTTAGGTTCTCTAATCTGTGTAACTGCTGCTTTTGGTGTACTCTTATCT - 6960 
-LSICLGSLI CVTAAFGVLLS 
-*V;FA*VL*SV*LLLLVYSYL 
KYLLRFSNLCNCCFWCTLI* 

6961 - AATTTTGGTGCTCCTTCTTATTGTAATGGCGTTAGAGAATTGTATCTTAATTCGTCTAAC - 7020 
-NFGAPSYCNGVRELYLNSSN 

- IL, VLLLIVMALENCILIRLT 

FW : CSFLL*WR*RIVS*FV*R 
7021 - GTTACTACTATGGATTTCTGTGAAGGTTCTTTTCCTTGCAGCATTTGTTTAAGTGGATTA - 7080
-VTTMDFCEGSFPCSICIiSGL 
~LL;LW I SV KVLFLAAFV* V D * 

y y; y g f l * rffslqhlfkwir 
7081 - gactcccttgattcttatccagctcttgaaaccattcaggtgacgatttcatcgtacaag - 7140
-dsldsypaletiqvtissyk 
-tp:liliqllkpfr*rfhrts 

LP;*FLSSS*NHSGDDFIVQA 
7141 - CTAGACTTGACAATTTTAGGTCTGGCCGCTGAGTGGGTTTTGGCATATATGTTGTTCACA - 7200
-LDLT I LGLAAEWVLAYML FT 
-*T"*QF*VWPLSGFWHICCSQ 
RL : DN FRSGR*VGFGI YVVHK 
7201 - AAATTCTTTTATTTATTAGGTCTTTCAGCTATAATGCAGGTGTTCTTTGGCTATTTTGCT - 7260
-KFFYLLGLSAIMQVFFGYFA 
-NS'FIY*VFQL*CRCSLAILL 
il|i»firs FSYNAGVLWLFC* 
7261 - AGTCATTTCATCAGCAATTCTTGGCTCATGTGGTTTATCATTAGTATTGTACAAATGGCA - 7320
-SHflSNSWLMWFIISIVQMA 

-vilssailgscglslvlykwh 
sf hqqflahvvyh* yctngt 
7321 - cccgtttctgcaatggttaggatgtacatcttctttgcttctttctactacatatggaag - 7380
-pv$amvrmy iffasfyyi wk 
-pf^lqwlgctsslllsttygr 
rf^cng^dvhllcffllhmee 
7381 - agctatgttcatatcatggatggttgcacctcttcgacttgcatgatgtgctataagcgc - 7440
-syvhimdgctsstcmmcykr 
-am;fiswmvaplrla*caisa 
lc-syhgwlhlfdlhdvl^aq 
7441 - aatcgtgccacacgcgttgagtgtacaactattgttaatggcatgaagagatctttctat - 7500
-nratrvecttivngmkrsfy 
-iv phalsvqlllma*rdlsm 
sc:htr*vynyc*wheeiflc 
7501 - gtctatgcaaatggaggccgtggcttctgcaagactcacaattggaattgtctcaattgt - 7560 
-vyanggrg fckthkwnclnc 
-sm ; qmeavasarltigivsiv 
lc'kwrpwllqdsqlelsql* 
7561 - GACACATTTTGCACTGGTAGTACATTCATTAGTGATGAAGTTGCTCGTGATTTGTCACTC - 7620
-DTFCTGST FI SDEVARDLSL 

- THjFALVVHSLVMKLLVI CHS 

HI;LHW*YIH***SCS*FVTP 
7621 - CAGTTTAAAAGACCAATCAACCCTACTGACCAGTCATCGTATATTGTTGATAGTGTTGCT - 7680
-QFKRPINPTDQSSYIVDSVA 
SLjKDQSTLLTSHRILIiIVLL 
V* i KTNQPY*PVIVYC**CCC 
7681 - GTGAAAAATGGCGCGCTTCACCTCTACTTTGACAAGGCTGGTCAAAAGACCTATGAGAGA - 7740
-VKNGALHLY FDKAGQKTYER 

- * K'MARFTS TLTRLVKRPMRD 

EKiWRASPLL*QGWSKDL*ET 
7741 - CATCCGCTCTCCCATTTTGTCAATTTAGACAATTTGAGAGCTAACAACACTAAAGGTTCA - 7800 
-HPLSHFVNLDNLRANNTKGS 
-IR=SPILSI*TI*ELTTLKVH 
SA^LPFCQFRQFES* Q H ^ RFT 
7801 - CTGCCTATTAATGTCATAGTTTTTGATGGCAAGTCCAAATGCGACGAGTCTGCTTCTAAG - 7860
-LPINVIVFDGKS KCDESASK 
-CLiLMS^FLMASPNATSLLLS 
AY;*CHSF*WQVQMRRVCF*V 
7861 - TCTGCTTCTGTGTACTACAGTCAGCTGATGTGCCAACCTATTCTGTTGCTTGACCAAGCT - 7920
-SA^VYYSQLMCQPILLLDQA 
" LL;LCTTVS *CANLFCCLTKL 
CFjCVLQSADVPTYSVA*PSS 
7921 - CTTGTATCAAACGTTGGAGATAGTACTGAAGTTTCCGTTAAGATGTTTGATGCTTATGTC - 7980 
-LV5NVGDS TEVSVKMFDAYV 
-LYIQTLEIVLKFPLRCLMLMS 
CIjKRWR*Y*SFR*DV*CLCR 
7981 - GACACCTTTTCAGCAACTTTTAGTGTTCCTATGGAAAAACTTAAGGCACTTGTTGCTACA - 8040
-DTFSATFSVPMEKLKALVAT 

- T PiFQQLLV FLWKNLRHLLLQ 

HLiFSNF* CSYGKT*GTCCYS 
8041 - GCTCACAGCGAGTTAGCAAAGGGTGTAGCTTTAGATGGTGTCCTTTCTACATTCGTGTCA - 8100
-AHSELAKGVALDGVL'S tfvs 

-lt!as*qrv*l*mvsflhscq 
sqirvskgcsfrwcpfyirvs 
8101 - gctgcccgacaaggtgttgttgataccgatgttgacacaaaggatgttattgaatgtctc - 8160 
-aaAqgvvdt DVDTKDVIECL 

LP;DKVLLI PMLTQRMLLNVS 
CP=TRCC*YRC*HKGCY*MSQ 
8161 - AAACTTTCACATCACTCTGACTTAGAAGTGACAGGTGACAGTTGTAACAATTTCATGCTC - 8220 
-KLSHHSDLEVTGDSCNNFML 
-NFjHITLT*K*QVTVVTISCS 

tf;tsl*lrsdr*ql*qfhah 
8221 - acctataataaggttgaaaacatgacgcccagagatcttggcgcatgtattgactgtaat - 8280 

-TYljlKVENMT PRDLGACI DCN 

- P I j I R L K T * RPEILAHVLTVM 

L * i * G * KH DAQRSWRMY* L * C 
8281 - GCAAGGCATATCAATGCCCAAGTAGCAAAAAGTCACAATGTTTCACTCATCTGGAATGTA - 8340
-ARHINAQVAKSHNVSLIWNV 
-QG ;ISMPK* QKVTMFHSSGM* 
KA ; YQCPSSKKSQCFTHLECK 
8341 - AAAGACTACATGTCTTTATCTGAACAGCTGCGTAAACAAATTCGTACTGCTGCCAAGAAG - 8400
-KDYMSLSEQLRKQIRTAAKK 

- KT-TCLYLNSCVNKFVLLPRR 

RLlHVFI *TAA*TNSYCCQEE 
8401 - AACAACATACCTTTTACACTAACTTGTGCTACAACTAGACAGGTTGTCAATGTCATAACT - 8460

- N N jl PFTLTCATTRQVVNVIT 

- t t ! y l l h * lvlql drlsm s * l 

qh-tfytnlcyn* tgcqchny 
8461 - actaaaatctcactcaagggtggtaagattgttagtacttgttttaaacttatgcttaag - 8520

- t k k slkgg kivstcfklmlk 
-lk;sbsrvvrllvlvlnlclr 

*NiIiTQGW*DC*YLF*TYA*G 
8521 - GCCACATTATTGTGCGTTCTTGCTGCATTGGTTTGTTATATCGTTATGCCAGTACATACA - 8580
-ATiLLCVLAALVCYIVMPVHT 

- PH;YCAFLLHWFV I SLCQYIH 

HI. IVRSCCIGLLYRYASTYI 
8581 - TTGTCAATCCATGATGGTTACACAAATGAAATCATTGGTTACAAAGCCATTCAGGATGGT - 8640 
-LS I HDGYTNE I IGYKAIQDG 

- CQ SMMVTQMKSLVTKPFRMV 

VN;P*WLHK*NHWLQSHSGWC 
8641 - GTCACTCGTGACATCATTTCTACTGATGATTGTTTTGCAAATAAACATGCTGGTTTTGAC - 8700
-VT^DII STDDCFANKHAGFD 

- S L V T S FLLM IVLQIttMLVLT 

hs!*hhfy* * l f c k * tcwf*r 
8701 - gcatggtttagccagcgtggtggttcatacaaaaatgacaaaagctgccctgtagtagct - 8760
-awfsqrggsykndkscpvva 
hgilasvvvhtkmtkaal* * l 
mvj*pawwfiqk*qklpcssc 
8761 - gctatcattacaagagagattggtttcatagtgcctggcttaccgggtactgtgctgaga - 8820
-aixtreigfivpglpgtvlr 
-ls:lqerlvs *clayrvlc*e 
yhiykrdwfhsawltgycaes 
8821 - gcaatcaatggtgacttcttgcattttctacctcgtgtttttagtgctgttggcaacatt - 8880
-a i ngdflhflprv fsavgni 
-qs'mvtsci fylvflvllatf 
nq;w*llafstscf*ccwqhl 
8881 - tgctacacaccttccaaactcattgagtatagtgattttgctacctctgcttgcgttctt - 8940
-cytpsklieysdfatsacvl 
-at : hlpnsls ivillpllafl 

LHjTFQTH*V* *FCYLCLRSC 
8941 - GCTGCTGAGTGTACAATTTTTAAGGATGCTATGGGCAAACCTGTGCCATATTGTTATGAC - 9000 
-AA&CTIFKDAMGKPVPYCYD 

- L L S VQFLRM LWAN LCH I V MT 

c*;vynf*gcygqtcaill*h 
9001 - actaatttgctagagggttctatttcttatagtgagcttcgtccagacactcgttatgtg - 9060 

- t n llegs i s yselrpdtryv 
-li|c*rvlflivsfvqtlvmc 

*f;argfyfl**assrhslca 
9061 - cttatggatggttccatcatacagtttcctaacacttacctggagggttctgttagagta - 9120 
-lm d)gs i iqfpntylegsvrv 
-lw:mvpsysfltltwrvlle* 
yg;wfhhtvs*hlpggfc*ss 
9121 - gtaacaacttttgatgctgagtactgtagacatggtacatgcgaaaggtcagaagtaggt - 9180
-vttfdaeycrhgtcersevg 

- *q;llmlstvdmvhakgqk*v 

n n ; f * c * v l * twymrkvrs ry 
9181 - atttgcctatctaccagtggtagatgggttcttaataatgagcattacagagctctatca - 9240
-i clstsgrwvlnnehyrals 
-fatlpvvdgflimsitelyq 
lp:iyqw*mgs***alqssir 
9241 - GGAGTTTTCTGTGGTGTTGATGCGATGAATCTCATAGCTAACATCTTTACTCCTCTTGTG - 9300

- G V F C GV DAMNLIANI ftplv 
~EFSVVLMR*I5*LTSLLLLC 

SFLWC*CDESHS*HLYSSCA 
9301 - CAACCTGTGGGTGCTTTAGATGTGTCTGCTTCAGTAGTGGCTGGTGGTATTATTGCCATA - 9360 
-QPVGALDVSASVVAGGIIAI 

- N L , W V L * MCLLQ * WLVVL LPY 

TC GCFRCVCFSSGWWYYCHI 
9361 - TTGGTGACTTGTGCTGCCTACTACTTTATGAAATTCAGACGTGTTTTTGGTGAGTACAAC - 9420 
-LVTCAAYY FMKFRRV FGEYN 
-W* i LVLPTTL*NSDVFIjVSTT 
GD'LCCLLLYEIQTCFW*VQP 
9421 - CATGTTGTTGCTGCTAATGCACTTTTGTTTTTGATGTCTTTCACTATACTCTGTCTGGTA - 9480 
-HVVAANALLFLMS F T I LCLV 
-MLLLLMHFCF*CLSLYSVWY 
C C. CC*CTFVFDVFHYTLSGT 

9481 - CCAGCTTACAGCTTTCTGCCGGGAGTCTACTCAGTCTTTTACTTGTACTTGACATTCTAT - 9540

- p a s flpgvysvfylyltfy 
-ql'tafcrestqsftct^hsi 

SLQLSAGSLLSLLLVLDILF 

9541 - TTCACCAATGATGTTTCATTCTTGGCTCACCTTCAATGGTTTGCCATGTTTTCTCCTATT - 9600

- F T N D V 5 FLAHLQWFAMFS P I 
-SP-MMFH3WLTFNGLPCFLLL 

HQ*CFI.LGSPSMVCHVFSYC 
9601 - GTGCCTTTTTGGATAACAGCAATCTATGTATTCTGTATTTCTCTGAAGCACTGCCATTGG - 9660 

- V P F W ITAIYVFCI8LKHCHW 
-CL-FG*QQSMYSVFL* STAIG 

AFLDNSNLCILYFSEALPLV 
9661 - TTCTTTAACAACTATCTTAGGAAAAGAGTCATGTTTAATGGAGTTACATTTAGTACCTTC - 9720
-FFWNYLRKRVMFNGVTFSTF 
S L T T I LGKE SCLMEL HLV PS 
L*;QLS*EKSHV*WSYI*YIiR 
9721 - GAGGAGGCTGCTTTGTGTACCTTTTTGCTCAACAAGGAAATGTACCTAAAATTGCGTAGC - 9780 
-EEAALCTFLLUKEMYLKLRS 

- RR LL CV P FCST RKCT * NCVA 

GG : CFVYLFAQQGNVPKIA*R 
9781 - GAGACACTGTTGCCACTTACACAGTATAACAGGTATCTTGCTCTATATAACAAGTACAAG - 9840
-ETLL PLTQYNRY LALYNKYK 
-RH:CC HLHS I T G I LLY I T STS 
DT VATYTV*QVSCSI*QVQV 
9841 - TATTTCAGTGGAGCCTTAGATACTACCAGCTATCGTGAAGCAGCTTGCTGCCACTTAGCA - 9900 

- Y FS GALDTT SYREAACCHLA 

- I S V E P* ILPAIVKQLAAT*Q 

FQ.WSLRYYQLS* SSLLPLSK 
9901 - AAGGCTCTAAATGACTTTAGCAACTCAGGTGCTGATGTTCTCTACCAACCACCACAGACA - 9960 
-KAtNDFSNSGADVI. YQPPQT 

- R L . * M TLATQVLMFS TNHHRH 

GSK*L*QLRC*CSLPTTTDI 
9961 - TCAATCACTTCTGCTGTTCTGCAGAGTGGTTTTAGGAAAATGGCATTCCCGTCAGGCAAA - 10020

- S ITS AVLQSGFRKMAFPSGK 

- Q S LL LFCRVVLGKWH S RQAK 

NH'FCCSAEWF*ENGIPVRQS 
10021 - GTTGAAGGGTGCATGGTACAAGTAACCTGTGGAACTACAACTCTTAATGGATTGTGGTTG - 10080
-VEGCMVQVTCGTTTLNGLWL 

-lk*gawyk*pvblqliimdcgw 
*r"vhgtsnlwnyns*wivvg 
10081 - GATGACACAGTATACTGTCCAAGACATGTCATTTGCACAGCAGAAGACATGCTTAATCCT - 10140
-DDTVYCPRHVICTAE DMLNP 
-MT'QY TV Q DMS FAQQ KT C L I L 
*HSILSKTCHLHSRRHA*S* 
10141 - AACTATGAAGATCTGCTCATTCGCAAATCCAACCATAGCTTTCTTGTTCAGGCTGGCAAT - 10200
-NY E DLL IRKSNHSFLVQAGN 
-TMKICSFANPTIAFLFRLAM 
L* ! RSAHSQIQP*LSCSGWQC 
10201 - GTTCAACTTCGTGTTATTGGCCATTCTATGCAAAATTGTCTGCTTAGGCTTAAAGTTGAT - 10260
-VQLRV I 6HSMQNCLLRLKVD 
" FN FVLLAILCKIVCLGLKLI 
ST SCYWPFYAKLSA*A*S*Y 
10261 - ACTTCTAACCCTAAGACACCCAAGTATAAATTTGTCCGTATCCAACCTGGTCAAACATTT - 10320
-TSNPKTPKYKFVRXQPGQTF 
-LLTLRHPSINLSVSNLVKHF 
F* P*DTQV*ICPYPTWSNIF 
10321 - TCAGTTCTAGCATGCTACAATGGTTCACCATCTGGTGTTTATCAGTGTGCCATGAGACCT - 10380
-SV3LACYNGS PSGVYQCAHRP 
-QF *HATMVHHLVFI SVP * DL 
SS. SMLQWFTIWCLSVCHET* 
10381 - AATCATACCATTAAAGGTTCTTTCCTTAATGGATCATGTGGTAGTGTTGGTTTTAACATT - 10440 
-NHTIKGSFLNGSCGSVGFNI 
-IIPLKVLS LMDHVVVLVLTL 
SY^H*RFFP*WIMW*CWF*H* 
10441 - GATTATGATTGCGTGTCTTTCTGCTATATGCATCATATGGAGCTTCCAACAGGAGTACAC - 10500

-dy dcvs fcymhhmelptgvh 
-im.iaclsaigiiwsfqqeyt 
l*;lrvfllyasygasnrstr 
10501 - gctggtactgacttagaaggtaaattctatggtccatttgttgacagacaaactgcacag - 10560 
-agtdlegkfygpfvdrqtaq 

- l v - l t * kvnsmvhllt dklhr 

w y. *lrr*ilwsic*qtnctg 
10561 - gctgcaggtacagacacaaccataacattaaatgttttggcatggctgtatgctgctgtt - 10620 
-aagt dtt itlnvlawlyaav 
-lq ;vqtqp*h*mfwhgcmlll 
cryrhnhnikcfgmavcccy 

10621 - ATCAATGGTGATAGGTGGTTTCTTAATAGATTCACCACTACTTTGAATGACTTTAACCTT - 10680

-ingdrwflnrftttlndf nl 
-sm;viggflidspll*mtltl 
qw;**vvs**ihhyfe*l*pc 
10681 - gtggcaatgaagtacaactatgaacctttgacacaagatcatgttgacatattgggacct - 10740
-vamkynyepltqdhvdil, gp 

- wq : ^sttmnl*hkimltywdl 

gn.evql*tfdtrsc*higts 

10741 - CTTTCTGCTCAAACAGGAATTGCCGTCTTAGATATGTGTGCTGCTTTGAAAGAGCTGCTG - 10800
-LSAQT GIAVLDMCAALKE LL 
-FL LKQELPS*ICVLL*KSCC 
FC SNRNCRLRYVCCFERAAA 

10801 - CAGAATGGTATGAATGGTCGTACTATCCTTGGTAGCACTATTTTAGAAGATGAGTTTACA - 10860
-QNQMNGRT ILGSTILEDEFT 

- R M "V * MVVLSLVALF * KMSLH 

EWYEWS YYPW*HYFRR*VYT 
10861 - CCATTTGATGTTGTTAGACAATGCTCTGGTGTTACCTTCCAAGGTAAGTTCAAGAAAATT - 10920 
-PFDVVRQCSGVTFQGKFKKI 
-HLMLLDNALVLPSKVS SRKL 

I*"CC*TMLWCYLPR + VQBNC 
10921 - GTTAAGGGCACTCATCATTGGATGCTTTTAACTTTCTTGACATCACTATTGATTCTTGTT - 10980
-VKJSTHHWMLLTFLTSLLILV 
-LR ALI I G C F * L S * H H Y * F L F 
*G"HSSLDAFNFLDITI DSCS 

10981 - CAAAGTACACAGTGGTCACTGTTTTTCTTTGTTTACGAGAATGCTTTCTTGCCATTTACT - 11040
"QSJQWSLFFFVYENAFLPFT 

- KV.HSGHC FSLFTRMLS CHLL 

KY. TVVTVFLCLREC FLAI Y S 
11041 - CTTGGTATTATGGCAATTGCTGCATGTGCTATGCTGCTTGTTAAGCATAAGCACGCATTC - 11100 
-LGIMAIAACAMLLVKHKHAF 

- LV LWQLLHVLCCLLS I S THS 

WY YGNCCMCYAAC*A*ARIL 
11101 - TTGTGCTTGTTTCTGTTACCTTCTCTTGCAACAGTTGCTTACTTTAATATGGTCTACATG - 11160 
-LCJLsFLLPS latvayfnmvym 

-cacfcylllqqlltliwstc 
vl:vsvtfscnscll*yglha 
11161 - cctgctagctgggtgatgcgtatcatgacatggcttgaattggctgacactagcttgtct - 11220
-paswvmrimtwleladtsls 

- l l ; a g * cvs *hglnwltlacl 

c*;lgdayhdma*ig*h*lvw 
11221 - ggttataggcttaaggattgtgttatgtatgcttcagctttagttttgcttattctcatg - 11280 
-gyrlkdcvmyasalvllilm 
-viglrivlcmlql**fclfs* 
l*"a*glcyvcfsfsfayshd 
11281 - acagctcgcactgtttatgatgatgctgctagacgtgtttggacactgatgaatgtcatt - 11340 
-tartvyddaarrvwtlmnvi 
-ql.alfmmmlldvfgh* * m s l 
ss;hcl**cc*tcldtdechy 
11341 - acacttgtttacaaagtctactatggtaatgctttagatcaagctatttccatgtgggcc - 11400 

-TLVYKVYYGNALDQAISMWA 
-HL FTKSTMVML * I KLFPCGP 
TC:LQSLLW*CFRSSYFHVGL 
11401 - TTAGTTATTTCTGTAACCTCTAACTATTCTGGTGTCGTTACGACTATCATGTTTTTAGCT - 11460

- L V I SVTSNY SGVVTTIMFLA 

* L'FL* P LT I LV SLRLS C F * L 
SY-FCNL*LFWCRYDYHVFS* 
11461 - AGAGCTATAGTGTTTGTGTGTGTTGAGTATTACCCATTGTTATTTATTACTGGCAACACC - 11520
-RAIVFVCVE YYPLLFITGNT 
-EL : *CLCVLSITHCYLLLATP 
S Y : S V C V C * VLPIVI YYWQHL 
11521 - TTACAGTGTATCATGCTTGTTTATTGTTTCTTAGGCTATTGTTGCTGCTGCTACTTTGGC - 11580 
-LQqiMLVYCFLGYCCCCYFG 

- YS'VSCLFIVS^AIVAAATLA 

tv yhacll flrlllllllwp 
11581 - cttttctgtttactcaaccgttacttcaggcttactcttggtgtttatgactacttggtc - 11640 
-lfcllnryfrltlgvydylv 

- fs'vystvtsglllvfmttws 

fl:ftqpllqayswcl*llgl 
11641 - tctacacaagaatttaggtatatgaactcccaggggcttttgcctcctaagagtagtatt - 11700
-stqefrymnsqgllppks s i 

- lh knlgi * tprgfcllrvvl 

ytri*vyelpgafas*e*y* 
11701 - gatgctttcaagcttaacattaagttgttgggtattggaggtaaaccatgtatcaaggtt - 11760
-dafklnikllgi ggkpcikv 
-ml:ssltlscwvlevnhvsrl 

c f" q a * h*vvgywr*tmyqgc 
11761 - GCTACTGTACAGTCTAAAATGTCTGACGTAAAGTGCACATCTGTGGTACTGCTCTCGGTT - 11820
-ATVQSKMS DVKCTSVVLLSV 

- LLYSLKCLT * SAHLWYCSRF 

YC:TV*NV*RKVHICGTALGS 
11821 - CTTCAACAACTTAGAGTAGAGTCATCTTCTAAATTGTGGGCACAATGTGTACAACTCCAC - 11880
-LQQLRVES S SKLWAQCVQLH 

- FNNLE*SHLLNCGHNVYNST 

ST;T*SRVIF*IVGTMCTTPQ 
11881 - AATGATATTCTTCTTGCAAAAGACACAACTGAAGCTTTCGAGAAGATGGTTTCTCTTTTG - 11940 
-NDILLAKDTTEAFEKMVSLL 

- M I'FFLQKTQLKLS RRWFLFC 

*YSSCKRHN* S FREDGFSFV 
11941 - TCTGTTTTGCTATCCATGCAGGGTGCTGTAGACATTAATAGGTTGTGCGAGGAAATGCTC - 12000
-SVLLSMQGAVDI NRLCEEML 
LFCYPCRVL* TLIGCARKCS 
CFAIHAGCCRH* *VVRGNAR 
12001 - GATAACCGTGCTACTCTTCAGGCTATTGCTTCAGAATTTAGTTCTTTACCATCATATGCC - 12060
-DNRATLQA IASEFSSLPSYA 

- IT VLLFRLLLQNLVLYHHMP 

*PiCYSSGYCFRI*FFTIICR 
12061 - GCTTATGCCACTGCCCAGGAGGCCTATGAGCAGGCTGTAGCTAATGGTGATTCTGAAGTC - 12120 
-AYATAQEAYEQAVANG DSEV 

- LM-P L PR R P M S R L * LMVI LKS 

LC;HCPGGL*AGCS*W*F*SR 
12121 - GTTCTCAAAAAGTTAAAGAAATCTTTGAATGTGGCTAAATCTGAGTTTGACCGTGATGCT - 12180 
-VLKKLKKSLNVAKSEFDRDA 
F S ! K S * R N L * M W L N L S LTVML 
SQ;KVKEIFECG*I*V*P*CC 
12181 - GCCATGCAACGCAAGTTGGAAAAGATGGCAGATCAGGCTATGACCCAAATGTACAAACAG - 12240
-AMQRKLEKMADQAMTQMYKQ 

- PC NASWKRWQIRL*PKCTNR 

HA:TQVGKDGRSGYDPNVQTG 
12241 - GCAAGATCTGAGGACAAGAGGGCAAAAGTAACTAGTGCTATGCAAACAATGCTCTTCACT - 12300 

- A R $ EDKRAKVT SAMQTMLFT 
-QD:LRTRGQK*LVLCKQCSSL 

K 1 : * GQEGKSN * CYANNALHY 
12301 - ATGCTTAGGAAGCTTGATAATGATGCACTTAACAACATTATCAACAATGCGCGTGATGGT - 12360

-mlrkldndalnn iinnardg 
-cl;gslimmhlttlstmrvmv 
a*ea***ct*qhyqqca*wl 

12361 - TGTGTTCCACTCAACATCATACCATTGACTACAGCAGCCAAACTCATGGTTGTTGTCCCT - 12420
-CVPLNI IPLTTAAKLMVVVP 

- V F HS TSYH * LQQPWSW LLS L 

CS : TQHHTI DYSSQTHGCCP* 
12421 - GATTATGGTACCTACAAGAACACTTGTGATGGTAACACCTTTACATATGCATCTGCACTC - 12480
-DYQTYKNTCDGNT F TYASAL 

- IM;VPTRTLVMVTPLHMHLHS 

LW YLQEHL * W * HLYI CI CTL 
12481 - TGGGAAATCCAGCAAGTTGTTGATGCGGATAGCAAGATTGTTCAACTTAGTGAAATTAAC - 12540 
-WEI QQVVDADSKIVQLSEIN 
-GKSSKLLMRIARLFNLVKLT 
GN PASC*CG*QDCST**N*H 
12541 - ATGGACAATTCACCAAATTTGGCTTGGCCTCTTATTGTTACAGCTCTAAGAGCCAACTCA - 12600 
-MDNSPNLAWPLIVTALRANS 
~WTIHQIWLGLLLLGL*EPTQ 
GQFTKFGLASYCYSSKSQLS 
12601 - GCTGTTAAACTACAGAATAATGAACTGAGTCCAGTAGCACTACGACAGATGTCCTGTGCG - 12660
-AVKLQNNELS PVALRQMSCA 

- LL;NYRIMN*VQ* HYDRCPVR 

C * : T T E * *TESSSTTTDVLCG 
12661 - GCTGGTACCACACAAACAGCTTGTACTGATGACAATGCACTTGCCTACTATAACAATTCG - 12720 
-AG XTQTACTDDNALAYYNNS 
-LV PHKQLVLMTMHLPTITIR 
WY'HTNSLY* *QCTCLL*QFE 
12721 - AAGGGAGGTAGGTTTGTGCTGGCATTACTATCAGACCACCAAGATCTCAAATGGGCTAGA - 12780
-KGGRFVLALLS DHQDLKWAR 

- RE VG LCWHYYQTTKI SWGLD 

GR*VCAGITIRPPRSQMG*I 
12781 - TTCCCTAAGAGTGATGGTACAGGTACAATTTACACAGAACTGGAACCACCTTGTAGGTTT - 12840 
-FPKS DGTGT IYTELEPPCRF 

- s l'rvmvqvq ft QN VJ n hlv GL 

P*E*WYRYNLHRTGTTL*VC 
12841 - GTTACAGACACACCAAAAGGGCCTAAAGTGAAATACTTGTACTTCATCAAAGGCTTAAAC - 12900

- V T DT PKGPKVKYLY FI KGLN 

- LQ"THQKGLK*NTCTSSKA*T 

YR;HTKRA* SE I LVLHQRLKQ 
12901 - AACCTAAATAGAGGTATGGTGCTGGGCAGTTTAGCTGCTACAGTACGTCTTCAGGCTGGA - 12960
-NLtfRGMVLGSLAATVRLQAG 
-T^IEVWCWAV* LLQYVFRLE 
PK *RYGAGQFSCYSTSSGWK 
12961 - AATGCTACAGAAGTACCTGCCAATTCAACTGTGCTTTCCTTCTGTGCTTTTGCAGTAGAC - 13020
-NATEVPANSTVLSFCAFAVD 
-ML-QKYLPIQLCFPSVLLQ*T 
CY RS TCQFNCAFLLCFCSRP 
13021 - CCTGCTAAAGCATATAAGGATTACCTAGCAAGTGGAGGACAACCAATCACCAACTGTGTG - 13080 
-PAKAYKDYLASGGQPI TMCV 
-LLKHIRIT*QVEDNQSPTV* 
C*.SI*GLPSKWRTTNHQLCE 
13081 - AAGATGTTGTGTACACACACTGGTACAGGACAGGCAATTACTGTAACACCAGAAGCTAAC - 13140 
-KMLCTHTGTGQAITVT PBAN 
-RCiCVHTLVQDRQLL^HQKLT 
DV VYTHWYRTGNYCNTRS*H 
13141 - ATGGACCAAGAGTCCTTTGGTGGTGCTTCATGTTGTCTGTATTGTAGATGCCACATTGAC - 13200
-MDQES FGGASCCLYCRCHID 
-WT'KSPLVVLHVVCIVDATLT 
GP : RVI)WWCFMLSVL*MPH*P 
13201 - CATCCAAATCCTAAAGGATTCTGTGACTTGAAAGGTAAGTACGTCCAAATACCTACCACT - 13260
-HPN PKGFCDLKGKYVQI PTT 

- IQILKDSVT*KVSTSKYLPL 

SK i S*RIL*LER*VRPNTYHL 
13261 - TGTGCTAATGACCCAGTGGGTTTTACACTTAGAAACACAGTCTGTACCGTCTGCGGAATG - 13320 
-CANDPVGFTLRNTVCTVCGM 
-VL-MTQWVLHLETQSVPSAEC 
C*:* P S G F Y T * KHSLYRLRNV 
13321 - TGGAAAGGTTATGGCTGTAGTTGTGACCAACTCCGCGAACCCTTGATGCAGTCTGCGGAT - 13380
-WKGYGCSCDQLRE PLMQSAD 
-GKVMAVVVTNSANP*CSLRM 
E R L W L * L * PTPRTLDAVCGC 
13381 - GCATCAACGTTTTTAAACGGGTTTGCGGTGTAAGTGCAGCCCGTCTTACACCGTGCGGCA - 13440
-AST FLNGFAV* VQPVLHRAA 
-HQ-RF* TGLRCKCSPSYTVRH 
IU'VFKRVCGVSAARLTPCGT 
13441 - CAGGCACTAGTACTGATGTCGTCTACAGGGCTTTTGATATTTACAACGAAAAAAGTGCTG - 13500
-QALVLMSSTGLLI FTTKKVL 

- R H : * Y * C R L Q G F * YLQRKKCW 

GT ST DVVYRAFDI YNE KSAG 
13501 - GTTTTGCAAAGTTCCTAAAAACTAATTGCTGTCGCTTCCAGGAGAAGGATGAGGAAGGCA - 13560 
-VLQS S *KLIAVASRRRMRKA 

- FCKVPKN*LLSLPGEG*GRQ 

FA KFLKTNCCRFQEKDEEGN 
13561 - ATTTATTAGACTCTTACTTTGTAGTTAAGAGGCATACTATGTCTAACTACCAACATGAAG - 13620
-IY*TLTL*LRGILCLTTNMK 
-FIRLLLCS*EAYYV*LPT*R 
LLDSYFVVKRHTMSNYQHEE 
13621 - AGACTATTTATAACTTGGTTAAAGATTGTCCAGCGGTTGCTGTCCATGACTTTTTCAAGT - 13680
-RLFI TWLKIVQRLLSMTFSS 
-DYL*LG*RLSSGCCP*LFQV 
TI YNLVKDCPAVAVHDFFKF 
13681 - TTAGAGTAGATGGTGACATGGTACCACATATATCACGTCAGCGTCTAACTAAATACACAA - 13740 
-LE*MVTWYKIYHVSV*LNTQ 
* S R W * HGTTYI T S A S N * I HN 
RVDGDMVPHISRQRLTKYTM 
13741 - TGGCTGATTTAGTCTATGCTCTACGTCATTTTGATGAGGGTAATTGTGATACATTAAAAG - 13800 

- W L I * SMLYVILMRVIVIH*K 
-G*"FSLCSTSF**G*L*YIKR 

AD LV YALRH F DE GNC D T LKE 
13801 - AAATACTCGTCACATACAATTGCTGTGATGATGATTATTTCAATAAGAAGGATTGGTATG - 13860
-KYSSHTIAVMMI IS IRRIGM 
-NT RHIQLL** * L F Q * EGLV* 
IL. VTYNCCDDDYFNKKDWYD 
13861 - ACTTCGTAGAGAATCCTGACATCTTACGCGTATATGCTAACTTAGGTGAGCGTGTACGCC - 13920

- T S * RI LTSYAYMLT*VSVYA 

-lr res *hltric* lr* actp 
fv en p di lrvyanlgervrq 
13921 - aatcattattaaagactgtacaattctgcgatgctatgcgtgatgcaggcattgtaggcg - 13980
-nhy * rlynsamlcvmqaii* a 
i i ikdctilrcya*crhcrr 
sl lktvqfcdamr dagivgv 
13981 - tactgacattagataatcaggatcttaatgggaactggtacgatttcggtgatttcgtac - 14040 
-y*h*iirilmgtgtisvisy 
"td-ir*sgs*w-elvrfr*frt 
lt;ldnqdlngnwydfgdfvq 
14041 - aagtagcaccaggctgcggagttcctattgtggattcatattactcattgctgatgccca - 14100 

- K * HQAAEFLLWI hi t h c * c p 
-SS : TRLRSSYCGFILLIADAH 

VA PGCGVPIVDSYYSLLMPI 
14101 - TCCTCAGTTTGACTAGGGCATTGGCTGCTGAGTCCCATATGGATGCTGATCTCGCAAAAC - 14160 
-SSL*LGHWLLSPIWMLISQN 
-PH : FD*GIGC*VPYGC* SRKT 
LT.LTRALAAESHMDADLAKP 
14161 - CACTTATTAAGTGGGATTTGCTGAAATATGATTTTACGGAAGAGAGACTTTGTCTCTTCG - 14220 
-HLI»SGIC*NMILRKRDFVSS 
-TY ,*VGFAEI*FYGRETLSLR 
LI KWDLLKYDFTEERLCLFD 
14221 - ACCGTTATTTTAAATATTGGGACCAGACATACCATCCCAATTGTATTAACTGTTTGGATG - 14280 
-TVILNIGTRHTIPIVLTVWM 

- P L F * ILGPDIPSQLY* L F G * 

RY FKYWDQTYHPNCINCLDD 
14281 - ATAGGTGTATCCTTCATTGTGCAAACTTTAATGTGTTATTTTCTACTGTGTTTCCACCTA - 14340
-IGVSFIVQTLMCYFLLCFHL 
-*VYPSLCKL*CVIFYCVSTY 
RC'ILHCANFNVLFSTVFPPT 

14341 - CAAGTTTTGGACCACTAGTAAGAAAAATATTTGTAGATGGTGTTCCTTTTGTTGTTTCAA - 14400 
-QVLDH**EKYL*MVFLLLFQ 

k f " w t t skkn1 crwcs fccfn 
sfgplvrkifvdgvpfvvst 
14401 - ctggataccattttcgtgagttaggagtcgtacataatcaggatgtaaacttacatagct - 14460 
-ldti fvs * esyiirm* t y i a 
-wipfs*vrsrt*sgcklt*ii 
gy hfrelgvvhnqdvnlhss 
14461 - CGCGTCTCAGTTTCAAGGAACTTTTAGTGTATGCTGCTGATCCAGCTATGCATGCAGCTT - 14520
-rvsvsrnf*cmll:cqlcmql 
-asqfqgtfsvcc*ssyacsf 

R L SFKELLVYAADPAMHAAS 
14521 - CTGGCAATTTATTGCTAGATAAACGCACTACATGCTTTTCAGTAGCTGCACTAACAAACA - 14580 

- L A I Y C * INALHAFQ*IiH*QT 

W Q * F I A R * T HYMLFSS CTNKQ 
GN. LLLDKRTTCFSVAALTNN 
14581 - ATGTTGCTTTTCAAACTGTCAAACCCGGTAATTTTAATAAAGACTTTTATGACTTTGCTG - 14640
-MLLFKLSNPVIL IKTFMTLL 
-CC FSNCQTR* F * * R L L * L C C 
VA FQTVKPGNFNKDFYDFAV 
14641 - TGTCTAAAGGTTTCTTTAAGGAAGGAAGTTCTGTTGAACTAAAACACTTCTTCTTTGCTC - 14700 
-CLKV SLRKEVLLN* NT SSLL 

- V * R F L * GRKFC*TKTLLLCS 

SK GFFKE GSSVELKHFFFAQ 
14701 - AGGATGGCAACGCTGCTATCAGTGATTATGACTATTATCGTTATAATCTGCCAACAATGT - 14760

-rmatllsv1mti iviicqqc 
-gw:qrcyq* l * l l s l * sannv 

dg:naaisdydyyrynlptmc 
14761 - gtgatatcagacaactcctattcgtagttgaagttgttgataaatactttgattgttacg - 14820 
-visdnsy s *lkllintlivt 
-*y qttpirs*sc**ii ) *llr 

di-rqllfvvevvdkyfdcyd 

14821 - ATGGTGGCTGTATTAATGCCAACCAAGTAATCGTTAACAATCTGGATAAATCAGCTGGTT - 14880 
-MVAVLMPTK* SLTIWINQLV 
-WWLY*CQPSNR*QSG* ISVJF 
GG CINANQVIVNNLDKSAGF 
14881 - TCCCATTTAATAAATGGGGTAAGGCTAGACTTTATTATGACTCAATGAGTTATGAGGATC - 14940
-S H &INGVRLiDFIMTQ * v m r i 
-PI-**MG* G*TLL*LNEL»*GS 
PF NKWGKARLYYDSMSYEDQ 
14941 - AAGATGCACTTTTCGCGTATACTAAGCGTAATGTCATCCCTACTATAACTCAAATGAATC - 15000
-KMHFSRIL SVMS S L L * L K * I 
~RCTFRVY*A*CHPYYNSNES 
DA LFAYTKRNVIPTI TQMNL 
15001 - TTAAGTATGCCATTAGTGCAAAGAATAGAGCTCGCACCGTAGCTGGTGTCTCTATCTGTA - 15060 
~LSMPLVQRIELAP*LVSLSV 
-*V-CH*CKE*SSHRSWCLYIi* 
KY.AI SAKN RARTVAGVS I CS 
15061 - GTACTATGACAAATAGACAGTTTCATCAGAAATTATTGAAGTCAATAGCCGCCACTAGAG - 15120 
-VL*QIDSFIRNY*SQ*PPLE 
-YYDK*TVSSEIIEVNSRH*R 
TMTNRQFHQKLLKSIAATRG 
15121 - GAGCTACTGTGGTAATTGGAACAAGCAAGTTTTACGGTGGCTGGCATAATATGTTAAAAA - 15180
-EL : LW*LEQAS F T V A G I IC * K 
-SYCGNWNKQVLRWLA* YVKN 
ATVVIGTSKFYGGWHNMLKT 
15181 - CTGTTTACAGTGATGTAGAAACTCCACACCTTATGGGTTGGGATTATCCAAAATGTGACA - 15240 
-LFTVM* KLH TLWVGI IQNVT 
-CL:Q*CRNSTPYGLGLSKM*Q 
VY SDVETPHLMGWDYPKCDR 
15241 - GAGCCATGCCTAACATGCTTAGGATAATGGCCTCTCTTGTTCTTGCTCGCAAACATAACA - 15300

- E PCLTCLG* WPLLFLLAN IT 
-SHA*HA* DNGLSCSCSQT*H 

AM. PNMLRIMASLVLARKHNT 

15301 - CTTGCTGTAACTTATCACACCGTTTCTACAGGTTAGCTAACGAGTGTGCGCAAGTATTAA - 15360

-LAVTYHTV S T G * LTSVRKY* 
-LL"*LITPFLQVS*RVCASIK 
CC-NLSHRFYRLA NECAQVLS 
15361 - GTGAGATGGTCATGTGTGGCGGCTCACTATATGTTAAACCAGGTGGAACATCATCGGGTG - 15420 
-VRWSCVAAH YMLNQVE HH PV 

- * D:GHVWRLT IC* TRWNIIR* 

EMVMCGGSLYVKPGGTSSGD 
15421 - ATGCTACAACTGCTTATGCTAATAGTGTCTTTAACATTTGTCAAGCTGTTACAGCCAATG - 15480 
-MLQLLMLI VSLTFVKLLQPM 
-CY-NCLC**CL*HLSSCYSQC 

ATTAYANSVFNICQAVTANV 

15481 - TAAATGCACTTCTTTCAACTGATGGTAATAAGATAGCTGACAAGTATGTCCGCAATCTAC - 15540

-*MHFFQLMV I R * LTSMSAIY 
-KCTSFN*W**DS*QVCPQST 
NALLST DGNKIADKYVRNLQ 

15541 - aacacaggctctatgagtgtctctatagaaatagggatgttgatcatgaattcgtggatg - 15600 
-ntgsmsvsieigmlimnswm 
-tqal*vsl*k*gc*s*irg* 
hr'lyeclyrnrdvdhefvde 

15601 - agttttacgcttacctgcgtaaacatttctccatgatgattctttctgatgatgccgttg - 15660 

-SFTLTCVNISP* *FFLMMPL 
-VliRLPA^TFLHODSF* * C R C 
FYAYLRKHFSMMILSDDAVV 
15661 - TGTGCTATAACAGTAACTATGCGGCTCAAGGTTTAGTAGCTAGCATTAAGAACTTTAAGG - 15720 
-CAITVTMRLKV* * L A L R T L R 

- V L *Q*LCGSRFSS*H*EL*G 

CY NSNYAAQGLVAS I KNFKA 
15721 - CAGTTCTTTATTATCAAAATAATGTGTTCATGTCTGAGGCAAAATGTTGGACTGAGACTG - 15780 
-QFFIIKJMCSCLRQNVGLRL 

- S S LLSK* CVHV * GKMLD* D * 

VBYYQNNVFMSEAKCWTETD 
15781 - ACCTTACTAAAGGACCTCACGAATTTTGCTCACAGCATACAATGCTAGTTAAACAAGGAG - 15840
-TLlKDLTNFAHS IQC*LNKE 

- P Y * RTSRI LLTAYNAS * T R R 

IiT KGPHEFCSQHTMLVKQGD 
15841 - ATGATTACGTGTACCTGCCTTACCCAGATCCATCAAGAATATTAGGCGCAGGCTGTTTTG - 15900 
-MI TCTCLTQI HQEY*AQAVL 

- * LRVPALPRSIKNIRRRLFC 

DY VYLPYPDPSRILGAGCFV 
15901 - TCGATGATATTGTCAAAACAGATGGTACACTTATGATTGAAAGGTTCGTGTCACTGGCTA - 15960

- S M XLSKQMVHL* LKGSCHWL 
-R*YCQNRWYTYD*KVRVTGY 

DDIVKTDGTLMIERFVSLAI 
15961 - TTGATGCTTACCCACTTACAAAACATCCTAATCAGGAGTATGCTGATGTCTTTCACTTGT - 16020
-IiMtiTHIiQNILIRSMLMSFTC 
-*CLPTYKTS*SGVC*CLSLV 
DAYPLTKHPNQEYADVFHLY 
16021 - ATTTACAATACATTAGAAAGTTACATGATGAGCTTACTGGCCACATGTTGGACATGTATT - 16080 
-IYNTLESYMMSLLATCWTCI 
-FTIH*KVT* *AYWPHVGHVF 
LQ.YIRKLHDELTGHMLDMYS 
16081 - CCGTAATGCTAACTAATGATAACACCTCACGGTACTGGGAACCTGAGTTTTATGAGGCTA - 16140
-p* C* LMITPHGTGNLS FMRL 
R N A N * **HLTVLGT* V L * G Y 
VM.LTNDNTSRYWEPEFYEAM 
16141 - TGTACACACCACATACAGTCTTGCAGGCTGTAGGTGCTTGTGTATTGTGCAATTCACAGA - 16200 
-CTHH IQSCRL* VLVYCA2HR 
-VHTTYSLAGCRCLCIVQFTD 
YT PHTVLQAVGACVLCNSQT 
16201 - CTTCACTTCGTTGCGGTGCCTGTATTAGGAGACCATTCCTATGTTGCAAGTGCTGCTATG - 16260

- L H FVAVPVLGDHSYVASAAM 

- FT S LRCL Y * E TI PMLQVLL* 

SL-RCGACIRRPFLCCKCCYD 
16261 - ACCATGTCATTTCAACATCACACAAATTAGTGTTGTCTGTTAATCCCTATGTTTGCAATG - 16320

- T M 5 FQHHTN * C C L Ij I PMFAM 

P C.H FN I TQI SVVC * SLGLQC 
HVilSTSHKLVLSVNPYVCNA 

16321 - ccccaggttgtgatgtcactgatgtgacacaactgtatctaggaggtatgagctattatt - 16380 
-pqvvmslm*hnci*ev*aii 
-pr;l*ch*cdttvsrryelll 

PG CDVTDVTQLYLGGMSYYC 
16381 - GCAAGTCACATAAGCCTCCCATTAGTTTTCCATTATGTGCTAATGGTCAGGTTTTTGGTT - 16440
-AS U I SLPLVFHYVLMVRFLV 
-QVT*ASH*FSIMC*WSGFWF 
KS HKPPI SFPLCANGQVFGL 
16441 - TATACAAAAACACATGTGTAGGCAGTGACAATGTCACTGACTTCAATGCGATAGCAACAT - 16500 
-YTKTHV*AVTMSLTSMR*QH 
-IQKHMCRQ* Q C H * LQCDSNM 
YK-NTCVGSDNVT DFNAIATC 
16501 - GTGATTGGACTAATGCTGGCGATTACATACTTGCCAACACTTGTACTGAGAGACTCAAGC - 16560 
-VIGLMLAITYLPTI. VLRDSS 

- * L D * CWRLH TCQHLY * BTQA 

DW TNAG DY I LAN TCT ERLKL 
16561 - TTTTCGGAGCAGAAACGCTCAAAGCCACTGAGGAAACATTTAAGCTGTCATATGGTATTG - 16620 
-FSQQKRSKPLRKHLSCHMVL 

- FR'SRNAQSH*GNI*AVIWYC 

FA;AE T LKATE E T FKL SY G IA 
16621 - CCACTGTACGCGAAGTACTCTCTGACAGAGAATTGCATCTTTCATGGGAGGTTGGAAAAC - 16680 
-PLYAKYSLTENCIF HGRLEN 

-hc;trstl*qriasfmggwkt 
tvrevlsdrelhlswevgkp 

16681 - CTAGACCACCATTGAACAGAAACTATGTCTTTACTGGTTACCGTGTAACTAAAAATAGTA - 16740 
-LDHH*TETMSLLVTV*LKIV 
* T TIEQKLCLYW L P C N * K * * 
RP PLNRNYVFTGYRVTKNSK 
16741 - AAGTACAGATTGGAGAGTACACCTTTGAAAAAGGTGACTATGGTGATGCTGTTGTGTACA - 16800 
-KYRLESTPLKKVTMVMLLCT 

- ST-DWRVHL* K R * L W * CCCVQ 

VQIGEYTFEKGDYGDAVVYR 
16801 - GAGGTACTACGACATACAAGTTGAATGTTGGTGATTACTTTGTGTTGACATCTCACACTG - 16860
-EVLRHTS*MLVITIjC* HLTL 
-RY YDIQVECW*LLCVDISHC 
GTTTYKLNVGDYFVLTSHTV 
16861 - TAATGCCACTTAGTGCACCTACTCTAGTGCCACAAGAGCACTATGTGAGAATTACTGGCT - 16920
-*CHLVHLL* CHKSTM* ELLA 
-NAT* CTYSSATRALCENYWL 
MPLSAPTLVPQEHYVRITGL 
16921 - TGTACCCAACACTCAACATCTCAGATGAGTTTTCTAGCAATGTTGCAAATTATCAAAAGG - 16980
-CTQHSTSQMSFLAMLQIIKR 
-VPNTQHLR*VF*QCCKLSKG 
YPTLNISDEFSSNVANYQKV 
16981 - TCGGCATGCAAAAGTACTCTACACTCCAAGGACCACCTGGTACTGGTAAGAGTCATTTTG - 17040 
-SACKSTLHSKDHLVLVRVIL 
-RHAKVLYTPRTTWYW * E S F C 
G M; Q K Y S TLQG P P GT GKS H FA 
17041 - CCATCGGACTTGCTCTCTATTACCCATCTGCTCGCATAGTGTATACGGCATGCTCTCATG - 17100
~PSDLLSITHLLA*CIRHALM 
-HRTCSLLPICSHSVYGMLSC 
IG : LALYYPSARIVYTACSHA 
17101 - CAGCTGTTGATGCCCTATGTGAAAAGGCATTAAAATATTTGCCCATAGATAAATGTAGTA - 17160 
-QLLMPYVKRH*NI CP* I N V V 

- S C * C PM * KG IKI FAHR * M * * 

AV;DALCEKALKYLP I DKCSR 
17161 - GAATCATACCTGCGCGTGCGCGCGTAGAGTGTTTTGATAAATTCAAAGTGAATTCAACAC - 17220 
-ESYLRVRA* SVL INS K * I Q H 
-NH TCACARRVF* * IQSEFNT 
II:PARARVECFDKFKVNSTL 
17221 - TAGAACAGTATGTTTTCTGCACTGTAAATGCATTGCCAGAAACAACTGCTGACATTGTAG - 17280 
-*NSMFSAL*MHCQKQLLTL* 
-RTVCFLHCKCIARN1SJC*HCS 
EQ YVFCTVNALPETTADIVV 
17281 - TCTTTGATGAAATCTCTATGGCTACTAATTATGACTTGAGTGTTGTCAATGCTAGACTTC - 17340 
-SLMKS LWLLIMT *VLSMLDF 

- L * *NLYGY*L*LECCQC*TS 

FDjEISMATNYDLSVVNARLR 
17341 - GTGCAAAACACTACGTCTATATTGGCGATCCTGCTCAATTACCAGCCCCCCGCACATTGC - 17400
-VQNTT SILAILLNYQPPAHC 
-C^KiTLRLYWRSCSITSPPHIA 
AKHYVY IGDPAQLPAPRTLL 
17401 - TGACTAAAGGCACACTAGAACCAGAATATTTTAATTCAGTGTGCAGACTTATGAAAACAA - 17460 

- * LKAH * NQNI LI QCADL*KQ 

D*-RHTRTRI F * FSVQTYENN 
TK GT LEPEYFNSVCRIjMKTI 
17461 - TAGGTCCAGACATGTTCCTTGGAACTTGTCGCCGTTGTCCTGCTGAAATTGTTGACACTG - 17520 

- * VQTCSLELVAVVLliKLLTL 

-rs rhv?wnlsplsc*nc*hc 
gpdm flgtcrrcpae ivdtv 
17521 - tgagtggtttagtttatgacaataagctaaaagcacacaaggataagtcagctcaatgct - 17580 

- * v ii * fmtis*khtri sqlna 
-ec-fsl*q*akstqg*vssml 

sa lvydnklkahkdksaqcf 
17581 - TCAAAATGTTCTACAAAGGTGTTATTACACATGATGTTTCATCTGCAATCAACAGACCTC - 17640
-skcs tkvllhmmfhlq st dl 
-qn'vlqrcyyt *cficnqqts 

kmfykgvithdvssainrpq 
17641 - AAATAGGCGTTGTAAGAGAATTTCTTACACGCAATCCTGCTTGGAGAAAAGCTGTTTTTA - 17700

-K*AL*ENFLHAILLGEKLFL 
-NRRCKRI SYTQSCLEKSCFY 
IGVVRB FLTRNPAWRKAVFI 
17701 - TCTCACCTTATAATTCACAGAACGCTGTAGCTTCAAAAATCTTAGGATTGCCTACGCAGA - 17760 

- S H t I IHRTL*LQKS*DCLRR 
-LTL* FTERCSFKNLRIAYAD 

SP'YNSQNAVASKILGLPTQT 
17761 - CTGTTGATTCATCACAGGGTTCTGAATATGACTATGTCATATTCACACAAACTACTGAAA - 17820
-LLJHHRVLNMTMSY SHKLLK 
-C*FITGF* I*LCHIRTNY*N 
VDSSQGSEYDYVIFTQTTET 
17821 - CAGCACACTCTTGTAATGTCAACCGCTTCAATGTGGCTATCACAAGGGCAAAAATTGGCA - 17880
-QHfLVMS TASMWLSQGQKLA 

- S T L L * CQPLQCGYHKGKNWH 

AH : SCNVNRFNVAITRAKIGI 
17881 - TTTTGTGCATAATGTCTGATAGAGATCTTTATGACAAACTGCAATTTACAAGTCTAGAAA - 17940

- F C A * CLI EI FMTNCNLQV * K 
~FV.HNV**RSL*QTAIYKSRN 

LCIMS DRDLYDKLQFTSLEI 
17941 - TACCACGTCGCAATGTGGCTACATTACAAGCAGAAAATGTAACTGGACTTTTTAAGGACT - 18000 
-YHVAMWL HYKQKM* LDFLRT 
-TTSQCGYITSRKCNWTF* GL 

PR. RNVATLQAENVTGLFKDC 

18001 - GTAGTAAGATCATTACTGGTCTTCATCCTACACAGGCACCTACACACCTCAGCGTTGATA - 18060

-VVRSLLV FI1HRHLHT SALI 
-**DHYWSSSYTGTYTPQR*Y 

SK I I tg'bhptqapthlsvdi 
18061 - taaaattcaagactgaaggattatgtgttgacataccaggcataccaaaggacatgacct - 18120 
-*n$rlkdyvltyqayqrt* p 
-ki.qd*rimc*htrhtkghdl 
kf kteglcvdipgipkdmty 
18121 - accgtagactcatctctatgatgggtttcaaaatgaattaccaagtcaatggttacccta - 18180 
-tvi?ssl*wvsk* itksmvtii 
~-p*:thlydgfqnelpsqwlp* 
rrlismmgfkmnyqvngypn 
18181 - atatgtttatcacccgcgaagaagctattcgtcacgttcgtgcgtggattggctttgatg - 18240 
-icxis pakklfvtfv rglalm 
-yvyhprrsyssrscvdwl*c 
mfitreeairhvrawigfdv 
18241 - tagagggctgtcatgcaactagagatgctgtgggtactaacctacctctccagctaggat - 18300 

-*RAVMQLEMLWVLTYLSS*D 
-RGLSCN*RCCGY*PTSPARI 
EG;CHATRDAVGTNLPLQLGF 
18301 - TTTCTAGAGGTGTTAACTTAGTAGCTGTACCGACTGGTTATGTTGACACTGAAAATAACA - 18360 

-flqvlt * * lyrlvmltlki t 
-fy!rc*lssctdwlc*h*k*h 
st gvnlvavptgyvdtennt 
18361 - cagaattcaccagagttaatgcaaaacctccaccaggtgaccagtttaaacatcttatac - 18420 

-QNSPELMQNLHQVTSLNILY 
-RI°HQS*CKTSTR*PV*TSYT 

ef'trvnakpppgdqfkhlip 
18421 - CACTCATGTATAAAGGCTTGCCCTGGAATGTAGTGCGTATTAAGATAGTACAAATGCTCA - 18480
-hscikacpgm*cvlr*ykcs 
- t h .v * rlalecsay* dstnaq 

LMYKGLPWNVVRIKIVQMLS 
18481 - GTGATACACTGAAAGGATTGTCAGACAGAGTCGTGTTCGTCCTTTGGGCGCATGGCTTTG - 18540

- V I H * KDCQTESCSSFGRMAL 

- * YTERIVRQSRVRPLGAWL* 

DTLKGLS DRVVFVLWAHGFE 
18541 - AGCTTACATCAATGAAGTACTTTGTCAAGATTGGACCTGAAAGAACGTGTTGTCTGTGTG - 18600 
-SLHQ-STLSRLDLKERVVCV 

- AYINEVLCQDWT*KNVLSV* 

LTSMKYFVKIGPERTCCLCD 
18601 - ACAAACGTGCAACTTGCTTTTCTACTTCATCAGATACTTATGCCTGCTGGAATCATTCTG - 18660 
-TNVQLAFLLHQILMPAGI IL 
-QTCNLLFY FIRYLCLLESFC 
KRATCFSTS SDTYACWNHSV 
18661 - TGGGTTTTGACTATGTCTATAACCCATTTATGATTGATGTTCAGCAGTGGGGCTTTACGG - 18720 

- w v l t m s ithl*lmfssgalr 

- g f ; * lcl * p i y d * c savg l yg 

gf dyvynpfmidvqqvjgftg 
18721 - gtaaccttcagagtaaccatgaccaacattgccaggtacatggaaatgcacatgtggcta - 18780 
-vtfrvtmtn xarymemhmwl 

- * p . s e * p * ptlpgtwkctcg* 

nl'qsnhdqhcqvhgnahvas 
18781 - gttgtgatgctatcatgactagatgtttagcagtccatgagtgctttgttaagcgcgttg - 18840 

- V V W L S * LDV * QSMtfALLSAL 
-L*CYHD*MFSSP*VLC*AR* 

CDAIMTRCLAVHECFVKRVD 
18841 - ATTGGTCTGTTGAATACCCTATTATAGGAGATGAACTGAGGGTTAATTCTGCTTGCAGAA - 18900 
-IGLLNTLL*EMN*GLILLAE 
-LVC*IPYYRR*TEG*FCLQK 
WS VEYPI I GDELRVNSAGRK 
18901 - AAGTACAACACATGGTTGTGAAGTCTGCATTGCTTGCTGATAAGTTTCCAGTTCTTCATG - 18960
-KY NTWL * S LB CL L I S FQF FM 
-STTHGCEVCIAC**VSSSS* 
VQ HMVVKSALLADKFPVLHD 
18961 - ACATTGGAAATCCAAAGGCTATCAAGTGTGTGCCTCAGGCTGAAGTAGAATGGAAGTTCT - 19020
-TLEIQRLS SVCLRLK*NGSS 

- HW-KSKGYQVCASG* SRMEVL 

IGNPKAI KCVPQAEVEWKFY 
19021 - ACGATGCTCAGCCATGTAGTGACAAAGCTTACAAAATAGAGGAACTCTTCTATTCTTATG - 19080
-TM^SHVVTKLTK^RNSSILM 

- R C S A M * * Q SLQNRGTLLFLC 

DAQPCSDKAYKIEELFYSYA 
19081 - CTACACATCACGATAAATTCACTGATGGTGTTTGTTTGTTTTGGAATTGTAACGTTGATC - 19140 

-lhitinslmv fvcfgivtli 
-yt!sr*ih*wclfvlel*r*s 
thhdkftdgvclfwncnvdr 
19141 - gttaccgagccaatgcaattgtgtgtaggtttgacacaagagtcttgtcaaacttgaact - 19200 
-vtqpmqlcvgltqbscqt* t 
-lpsqcncv*v*hkslvklel 
yp-anaivcrfdtrvlsnlnl 
19201 - TACCAGGCTGTGATGGTGGTAGTTTGTATGTGAATAAGCATGCATTCCACACTCCAGCTT - 19260
-yqavmvvvcm* i smhstlql 

- t r l * ww * fvce * . a c i p hs s f 

pg cdggslyvnkhafhtpaf 

19261 - TCGATAAAAGTGCATTTACTAATTTAAAGCAATTGCCTTTCTTTTACTATTCTGATAGTC - 19320 
-S IKVHLLI * SNCLSFTILIV 

- R*KCIY*FKAIAFL>LIjF* * S 

DKSAFTNLKQLPFFYYS DSP 
19321 - GTTGTGAGTCTCATGGCAAACAAGTAGTGTCGGATATTGATTATGTTCCACTCAAATCTG - 19380
-LVSLMANK* CRILIMFHSNL 
-L*VSWQTSSVGY*LCSTQIC 
CESHGKQVV SDI DYVPLKSA 
19381 - CTACGTGTATTACACGATGCAATTTAGGTGGTGCTGTTTGCAGACACCATGCAAATGAGT - 19440 
-.LRVLHDAI *VVLFADTMQMS 
-YVYYTMQFRWCCLQTPCK*V 
TCITRCNLGGAVCRHHANEY 
19441 - ACCGACAGTACTTGGATGCATATAATATGATGATTTCTGCTGGATTTAGCCTATGGATTT - 19500 

- T DSTWMHI I * * FLLDLAYGF 

- PT VLGCI* YDDFCW I * PMDL 

RQYLDAYNMMISAGFSLWIY 
19501 - ACAAACAATTTGATACTTATAACCTGTGGAATACATTTACCAGGTTACAGAGTTTAGAAA - 19560 
-TNNLILITCGIHLPGYRV*K 
-QTI * YL*PVEYI YQVTEFRK 
KQFDTYNLWNTFTRLQSLE. N 
19561 - ATGTGGCTTATAATGTTGTTAATAAAGGACACTTTGATGGACACGCCGGCGAAGCACCTG - 19620
-MWLIMLLIKDTLMDTPAKHL 
-C 'GL*CC**RTL*WTRRRSTC 
V A. YNVVNKGHFDGHAGEAPV 
19621 - TTTCCATCATTAATAATGCTGTTTACACAAAGGTAGATGGTATTGATGTGGAGATCTTTG - 19680 
-FPSLIMLFTQR*MVLMWRSL 

- F H H * *CCLHKGRWY*CGDL* 

S I. INNAVYTKVDGI DVEIFE 
19681 - AAAATAAGACAACACTTCCTGTTAATGTTGCATTTGAGCTTTGGGCTAAGCGTAACATTA - 19740 
-KIRQHFLLMLHLSFGLSVTL 
-K* DNTSC*CCI*ALG*A*H* 
NK ( TTLPVNVAFELWAKRNIK 
19741 - AACCAGTGCCAGAGATTAAGATACTCAATAATTTGGGTGTTGATATCGCTGCTAATACTG - 19800 
-NQCQRLRYSI IWVLI SLLIL 
-TS:ARD*DTQ*FGC*YRC*YC 
PVPEIKILNNLGVDIAANTV 
19801 - TAATCTGGGACTACAAAAGAGAAGCCCCAGCACATGTATCTACAATAGGTGTCTGCACAA - 19860

- * S&TTKEKPQHMYLQ* VSAQ 

- N L GLQKRS PST C I YNRCLHN 

I W DYKREAPAHV S T IGVCTM 
19861 - TGACTGACATTGCCAAGAAACCTACTGAGAGTGCTTGTTCTTCACTTACTGTCTTGTTTG - 19920 
-* LTLPRNLLRVLVIiHLLSCL 
D * H C Q E T Y *ECliFFTYCLV* 
TD. IAKKPTESACSSLTVLFD 
19921 - ATGGTAGAGTGGAAGGACAGGTAGACCTTTTTAGAAACGCCCGTAATGGTGTTTTAATAA - 19980 
-MVEWKDR* TFLETPVMVF* * 

- W *SGRTGRPF*KRP*WCFNN 

G R. VEGQVDLFRNARNGVLIT 
19981 - CAGAAGGTTCAGTCAAAGGTCTAACACCTTCAAAGGGACCAGCACAAGCTAGCGTCAATG - 20040 
-QKVQSKV* HLQRDQ HKLASM 
-RRFSQRSNTFKGTSTS * R Q W 
EG-SVKGLTPSKGPAQASVNG 
20041 - GAGTCACATTAATTGGAGAATCAGTAAAAACACAGTTTAACTACTTTAAGAAAGTAGACG - 20100
-ESH*LENQ*KHSLTTLRK*T 
-SHINWRISKNTV*LL*ESRR 
VT LI GESVKTQFNYF.KKVDG 
20101 - GCATTATTCAACAGTTGCCTGAAACCTACTTTACTCAGAGCAGAGACTTAGAGGATTTTA - 20160 
-ALFNS CLKPTLLRAET * R I L 
-HYSTVA*NLLYSEQRLRGF* 
I IQQLPETYFTQSRDLEDFK 
20161 - AGCCCAGATCACAAATGGAAACTGACTTTCTCGAGCTCGCTATGGATGAATTCATACAGC - 20220

- S P DHKWKLT F S SS LWMNSYS 
-AQITNGN*LSRARYG*IHTA 

PRSQMETDFLELAMDE F I Q R 
20221 - GATATAAGCTCGAGGGCTATGCCTTCGAACACATCGTTTATGGAGATTTCAGTCATGGAC - 20280 

- D I S S RAMPSNT SFMEI SVMD 

- I *-ARGLCLRTHRLWRFQSWT 

YKLEGYAFEHIVYGDFSHGQ 
20281 - AACTTGGCGGTCTTCATTTAATGATAGGCTTAGCCAAGCGCTCACAAGATTCACCACTTA - 20340 

- N L A V F I -k * * & * PSAHKIHHL 
-TW RSSFNDRLSQALTRFTT* 

LG'GLHLMIGLAKRSQDSPLK 
20341 - AATTAGAGGATTTTATCCCTATGGACAGCACAGTGAAAAATTACTTCATAACAGATGCGC - 20400 

- N * R I LSLWTAQ * K X T S * Q M R 

- IR GFYPYGQHSEKLLHNRCA 

LE. DFIPMDSTVKNYF1 TDAQ 
20401 - AAACAGGTTCATCAAAATGTGTGTGTTCTGTGATTGATCTTTTACTTGATGACTTTGTCG - 20460
-KQVHQNVCVL* LIFYLMTLS 
-NRFIKMCVFCD*SFT**LCR 
TG;S SKCVCSV I DLLLDDFVE 
20461 - AGATAATAAAGTCACAAGATTTGTCAGTGATTTCAAAAGTGGTCAAGGTTACAATTGACT - 20520

- R * * SHKICQ*FQKWSRLQLT 

DNKVTRFVS DFKS GQGYN* L 
I I KSQDLSVISKVVKVTIDY 
20521 - ATGCTGAAATTTCATTCATGCTTTGGTGTAAGGATGGACATGTTGAAACCTTCTACCCAA - 20580 
-MLKFHSCFGVRMDMLKPSTQ 
-C*;NFIHALV*GWTC*NIiliPK 
A E I S FMLWCKDG HVET FY PK 
20581 - AACTACAAGCAAGTCAAGCGTGGCAACCAGGTGTTGCGATGCCTAACTTGTACAAGATGC - 20640 
-NY KQ VKRGNQVLRC LT CTRC 

- T T " S K S SVATRCCDA* LVQDA 

LQASQAWQPGVAMPNLYKMQ 
20641 - AAAGAATGCTTCTTGAAAAGTGTGACCTTCAGAATTATGGTGAAAATGCTGTTATACCAA - 20700
-KECFLKSVT FRI MVKMLLYQ 
-KN.AS*KV*PSELW*KCCYTK 
RM LLEKCDLQNYGENAVIPK 
20701 - AAGGAATAATGATGAATGTCGCAAAGTATACTCAACTGTGTCAATACTTAAATACACTTA - 20760 

- K E * * *MSQSILNCVNT*IHL 
-RNNDECRKVYSTVSI LKYTY 

GI MMNVAKYTQLCQYLNTLT 
20761 - CTTTAGCTGTACCCTACAACATGAGAGTTATTCACTTTGGTGCTGGCTCTGATAAAGGAG - 20820 

- L * XiY PTT* ELFTLVLALIKE 
-FSCTLQHESYSLWCVilL**RS 

LA VPYNMRVIHFGAGS DKGV 
20821 - TTGCACCAGGTACAGCTGTGCTCAGACAATGGTTGCCAACTGGCACACTACTTGTCGATT - 20880 
-LHQVQLCS DNGCQLAHYLS I 

- C T R Y SCAQTMVAMWHTTCRF 

AP GTAVLRQWLPTGTLLVDS 
20881 - CAGATCTTAATGACTTCGTCTCCGACGCAGATTCTACTTTAATTGGAGACTGTGCAACAG - 20940 
-QILMTSSPTQILL* LETVQQ 

- R S * * LRLRRRFYFNWRLCNS 

DL. N DFVSDADSTLI GDCATV 
20941 - TACATACGGCTAATAAATGGGACCTTATTATTAGCGATATGTATGACCCTAGGACCAAAC - 21000 
-YI&LINGTLLLAICMTLGPN 

- T Y G * *MGPYY*RYV*P*DQT 

H T ANKW DL I I S DMY D P RTKH 
21001 - ATGTGACAAAAGAGAATGACTCTAAAGAAGGGTTTTTCACTTATCTGTGTGGATTTATAA - 21060
-M* QKRMTLKKGFSLI CVDL* 
-CD;KRE*L*RRVFHLSVWIYK 
VTiKENDSKEGFFTYLCGFIK 

21061 - AGCAAAAACTAGCCCTGGGTGGTTCTATAGCTGTAAAGATAACAGAGCATTCTTGGAATG - 21120
-SK^*PWVVIi*L*R*QSILGM 
-AK;TSPGWFYSCKDNRAFLEC 

qk'lalggsiavkitehswna 

21121 - CTGACCTTTACAAGCTTATGGGCCATTTCTCATGGTGGACAGCTTTTGTTACAAATGTAA - 21180 

- L T jT T SLWAI SHGGQLLLQM* 

* P'LQAYGPFLMVDS FCYKC K 
DL'YKLMGHFSWWTAFVTNVN 
21181 - ATGCATCATCATCGGAAGCATTTTTAATTGGGGCTAACTATCTTGGCAAGCCGAAGGAAC - 21240
-MHHHRKHF* LGLTI LASRRN 

- C I : I IGS IFNWG*LSWQAEGT 

ASSSEAFLIGANYLGKPKEQ 
21241 - AAATTGATGGCTATACCATGCATGCTAACTACATTTTCTGGAGGAACACAAATCCTATCC - 21300 
-KLMAIPCMLTTFSGGTQILS 
-N^^WLYHAC^LHFLEEHKSYP 
IDsGYTMHANYIFWRNTNPIQ 
21301 - AGTTGTCTTCCTATTCACTCTTTGACATGAGCAAATTTCCTCTTAAATTAAGAGGAACTG - 21360 

-scLpihslt*anflln*eel 

- VV : F1jFTL*HEQISS* ikrnc 

ls;syslfdmskfplklrgta 
21361 - ctgtaatgtctcttaaggagaatcaaatcaatgatatgatttattctcttctggaaaaag - 21420 

-L*CLLRRIKSMI*FILFWKK 
-CNiVS * GESNQ* Y D L F S S GKR 
VMiSLKENQIWDMI YS LLEKG 
21421 - GTAGGCTTATCATTAGAGAAAACAACAGAGTTGTGGTTTCAAGTGATATTCTTGTTAACA - 21480
-VGLSLEKTTELW FQVIF LLT 
-*A,YH*RKQQSCGFK*YSC*Q 
RL;1 IRENNRVVVSSDILVNN 
21481 - ACTAAACGAACATGTTTATTTTCTTATTATTTCTTACTCTCACTAGTGGTAGTGACCTTG - 21540 
-TK^TCLFSYYFLLSLVVVTL 
~LN 'EHVYFLIISYSH*W**P* 
* T : N M F I FLLFLTLTSGS DLD 
21541 - ACCGGTGCACCACTTTTGATGATGTTCAAGCTCCTAATTACACTCAACATACTTCATCTA - 21600
-TGAPIjLMMFKIiLITLNILHL 

- p v ;h h f * *csss*lhstyfiy 

rcittfddvqapnytqhtssm 
21601 - TGAGGGGGGTTTACTATCCTGATGAAATTTTTAGATCAGACACTCTTTATTTAACTCAGG - 21660

- * ggfti lmkfldqtlfi * l r 
-eg;glls**nf*irhslfnsg 

rg:vyypdeifrsdtlyl.tqd 
21661 - atttatttcttccattttattctaatgttacagggtttcatactattaatcatacgtttg - 21720 
-i y ffhfi lmlqgfi ll i i r l 
f 1 '.3 s ilf*cyrvsyy* syvw 
lf : lpfysnvtgfhtinhtfg 
21721 - GCAACCCTGTCATACCTTTTAAGGATGGTATTTATTTTGCTGCCACAGAGAAATCAAATG - 21780
-atlsyllrmvfi li.pqrnqm 
-qp chtf*gwylfcchreikc 
np;vi pfkdgi yfaateksnv 
21781 - TTGTCCGTGGTTGGGTTTTTGGTTCTACCATGAACAACAAGTCACAGTCGGTGATTATTA - 21840
-lsvvgflvlp*-ttshsr*ll 

- cpwlgfwfyheqqvtvgdyv 

vr'gwvfgstmnnksqsviii 
21841 - TTAACAATTCTACTAATGTTGTTATACGAGCATGTAACTTTGAATTGTGTGACAACCCTT - 21900
-LTILLMLLYEHVTZjNCVTTL 
-*QFY*CCYTSM*L*IV*QPF 
NN. STNVVIRACNFELCDNPF 
21901 - TCTTTGCTGTTTCTAAACCCATGGGTACACAGACACATACTATGATATTCGATAATGCAT - 21960 
-SLJjFLNPWVHRHI l * y s imh 
-LC : CF*THGYTDTYYDIR*CI 
FA ; VSKPMGTQTHTMIFDNAF 
21961 - TTAATTGCACTTTCGAGTACATATCTGATGCCTTTTCGCTTGATGTTTCAGAAAAGTCAG - 22020 
-LIALSSTYLMPFRLMFQKSQ 
- * L HFRVH I * C L F A * CFRKVR 
NC : TFEYI S DAFSLDVSEKSG 
22021 - GTAATTTTAAACACTTACGAGAGTTTGTGTTTAAAAATAAAGATGGGTTTCTCTATGTTT - 22080
-VILNTYESLCLKIKMGFSMF 
-*F*TLTRVCV*K*RWVSLCL 
NF:KHLRE FVFKN KDGFLYVY 
22081 - ATAAGGGCTATCAACCTATAGATGTAGTTCGTGATCTACCTTCTGGTTTTAACACTTTGA - 22140 
-IRAINL*M* FVIYLLVLTL* 
-*G:LSTYRCSS*STFWF*HFE 
KG : YQP I DVVRDL PSGFNTLK 
22141 - AACCTATTTTTAAGTTGCCTCTTGGTATTAACATTACAAATTTTAGAGCCATTCTTACAG - 22200
-NLFLSCLLVLTLQILEPFLQ 
~ T Y ; F * VASWY * R Y K F * S H S YS 
PIFKLPLG I N I TNFRA I LTA 
22201 - CCTTTTCACCTGCTCAAGACATTTGGGGCACGTCAGCTGCAGCCTATTTTGTTGGCTATT - 22260 
-PFHLLKTFGARQLQPI L L A I 

-lf.tcsrhlghvscslfcwiif 
fs paqdiwgtsaaayfvgyl 
22261 - taaagccaactacatttatgctcaagtatgatgaaaatggtacaatcacagatgctgttg - 22320 
" * sqlhlcs smmkmvqsqmll 
-ka;nyiyaqv**kwynhrcc* 
kp:ttfmlkydengtitdavd 

22321 - ATTGTTCTCAAAATCCACTTGCTGAACTCAAATGCTCTGTTAAGAGCTTTGAGATTGACA - 22380 
-IVtKlHLLNSNALLRALRLT 
-LF SKSTC* TQMLC* EL* D * Q 
C S ; Q N P L A E LKCSVKS FE I DK 

22381 - AAGGAATTTACCAGACCTCTAATTTCAGGGTTGTTCCCTCAGGAGATGTTGTGAGATTCC - 22440
-KEFTRPLI SGLFPQEML*DS 
-RN : LPDL* FQGCSLRRCCEI P 
GliYQTSNFRVVPSGDVVRFP 

22441 - CTAATATTACAAACTTGTGTCCTTTTGGAGAGGTTTTTAATGCTACTAAATTCCCTTCTG - 22500 
-LliiQTCVLLERFLMLLNSLL 

-*y!yklvsfwrgf*cy* IPFC 
ni;tnlcpfgevfnatkfpsv 
22501 - TCTATGCATGGGAGAGAAAAAAAATTTCTAATTGTGTTGCTGATTACTCTGTGCTCTACA - 22560
-smhgrekkflivll itlcst 

LC MGE KKN F * LCC * L L CALQ 
YAiWERKKI SNCVADYSVLYN 
22561 - ACTCAACATTTTTTTCAACCTTTAAGTGCTATGGCGTTTCTGCCACTAAGTTGAATGATC - 22620 

-tqhffqplsamaflpls*mi 
-ln;iffnl*vlwrfch*ve*s 

S T FFS T FKCYGV SATKLN DL 
22621 - TTTGCTTCTCCAATGTCTATGCAGATTCTTTTGTAGTCAAGGGAGATGATGTAAGACAAA - 22680 
-FASPMSMQ I LIi*SREMM* DK 
-LL:LQCLCRFFCSQGR*CKTN 

CFSNVYADSFVVKGDDVRQI 
22681 - TAGCGCCAGGACAAACTGGTGTTATTGCTGATTATAATTATAAATTGCCAGATGATTTCA - 22740
-*RQDKLVLLLIIIINCQMIS
-SARTNWCYC*L*L*IAR*FH
APGQTGVIADYNYKLPDDFM
22741 - TGGGTTGTGTCCTTGCTTGGAATACTAGGAACATTGATGCTACTTCAACTGGTAATTATA - 22800
-WVVSLLGILGTLMLLQLVII
-GLCPCLEY*EH*CYFNW*L*
GCVLAWNTRNIDATSTGNYN
22801 - ATTATAAATATAGGTATCTTAGACATGGCAAGCTTAGGCCCTTTGAGAGAGACATATCTA - 22860
-IINIGILDMASLGPLRETYL
-L*I*VS*TWQA*AL*ERHI*
YKYRYLRHGKLRPFERDISN
22861 - ATGTGCCTTTCTCCCCTGATGGCAAACCTTGCACCCCACCTGCTCTTAATTGTTATTGGC - 22920
-MCLSPLMANLAPHLLLIVIG
-CAFLP*WQTLHPTCS*LLLA
VPFSPDGKPCTPPALNCYWP
22921 - CATTAAATGATTATGGTTTTTACACCACTACTGGCATTGGCTACCAACCTTACAGAGTTG - 22980
-H*MIMVFTPLLALATNLTEL
-IK*LWFLHHYWHWLPTLQSC
LNDYGFYTTTGIGYQPYRVV
22981 - TAGTACTTTCTTTTGAACTTTTAAATGCACCGGCCACGGTTTGTGGACCAAAATTATCCA - 23040
-*YFLLNF*MHRPRFVDQNYP 
STFF*TFKCTGHGLWTKI IH 
VLlSFELLNAPATVCGPKLST 
23041 - CTGACCTTATTAAGAACCAGTGTGTCAATTTTAATTTTAATGGACTCACTGGTACTGGTG - 23100 
-LTLLRTSVS I LI LMDS LVLV 

-*p ;y*epvcqf* f*wthwywc 
dl;i k n qcvn fn fnglt gtgv 
23101 - TGTTAACTCCTTCTTCAAAGAGATTTCAACCATTTCAACAATTTGGCCGTGATGTTTCTG - 23160

- c * lllqrdfnhfnmlavmfl 
>vn : sffkeististiwp*cf* 

lt'psskrfqpfqqfgrdvsd 
23161 - atttcactgattccgttcgagatcctaaaacatctgaaatattagacatttcaccttgct - 23220 

-ISliIPFEI LKHLKY*TFHLA 

- f h > frsrs * n i * n irh ftll 

ft;dsvrdpktseildispcs 
23221 - CTTTTGGGGGTGTAAGTGTAATTACACCTGGAACAAATGCTTCATCTGAAGTTGCTGTTC - 23280
-llgv*v*lhleqmlhlkllf 

- f w -g c kcn y tw n kc f i * sccs 

fg.gvsvitpgtnassevavl 
23281 - tatatcaagatgttaactgcactgatgtttctacagcaattcatgcagatcaactcacac - 23340 

-YIKMLTALMFLQQFMQINSH 
IS;RC*LH* CFYSNSCRSTHT 
YQ-DVNCTDVSTAIHADQLTP 
23341 - CAGCTTGGCGCATATATTCTACTGGAAACAATGTATTCCAGACTCAAGCAGGCTGTCTTA - 23400 
-QLGAYILLETMYSRLKQAVL 

- S L 'A H I FYWKQCIPDSSRLSY 

A W R I YSTGNNVFQTQAGCLI 
23401 - TAGGAGCTGAGCATGTCGACACTTCTTATGAGTGCGACATTCCTATTGGAGCTGGCATTT - 23460 

- * EtSMSTLLMSATFLLELAF 

RS *ACRHFL * VRHSYW SV7HL 
GA EHVDTSYECDIPIGAGIC 
23461 - GTGCTAGTTACCATACAGTTTCTTTATTACGTAGTACTAGCCAAAAATCTATTGTGGCTT - 23520
-VLVTIQFLYYVVLAKNLLWL 
-C* : LPYSFFIT*Y*PKIYCGL 
ASYHTVSLLRSTSQKS IVAY 
23521 - ATACTATGTCTTTAGGTGCTGATAGTTCAATTGCTTACTCTAATAACACCATTGCTATAC - 23580
-ILCL*VLIVQLLT31)IT P L L Y 
Y Y ' V F R C * * FNCLL * * HHCYT 
TM : SLGADSS IAYSNNTIAIP 
23581 - CTACTAACTTTTCAATTAGCATTACTACAGAAGTAATGCCTGTTTCTATGGCTAAAACCT - 23640 
-LLTFQLALLQK* CLFLWLKP 

- Y * ; L F N * HYYRSNACFYG*NL 

T N : F S ISITTEVMPVSMAKTS 
23641 - CCGTAGATTGTAATATGTACATCTGCGGAGATTCTACTGAATGTGCTAATTTGCTTCTCC - 23700

- P * IVICTSAEILLNVLICFS 

- R R L * YVHLRRFY * M C * FASP 

VD CNMY ICGDSTECANLLLQ 
23701 - AATATGGTAGCTTTTGCACACAACTAAATCGTGCACTCTCAGGTATTGCTGCTGAACAGG - 23760 
-NMVAFAHN* IVHSQVLLLNR 

- I W>LLHTTKSCTLRYCC* T G 

YGSFCTQLNRALSGIAAEQD 
23761 - ATCGCAACACACGTGAAGTGTTCGCTCAAGTCAAACAAATGTACAAAACCCCAACTTTGA - 23820
-IATHVKCSLKSNKCTKPQL* 

- S Q . H T * SVRSS Q TNVQN PN FE 

RN'TREVFAQVKQMYKTPTLK 
23821 - AATATTTTGGTGGTTTTAATTTTTCACAAATATTACCTGACCCTCTAAAGCCAACTAAGA - 23880 
-NILVV LIFHKYYLTL* SQLR 
-I F W F * F F T N I T * P S K A N * E 

YF^GG FNFSQI LP DPLKPTKR 
23881 - GGTCTTTTATTGAGGACTTGCTCTTTAATAAGGTGACACTCGCTGATGCTGGCTTCATGA - 23940 
-GLJjLRTCSLIR* hslmlas * 
-VF-Y*GLAL* *GDTR*CWLHE 
SF^IEDLLFNKVTLADAGFMK 

23941 - AGCAATATGGCGAATGCCTAGGTGATATTAATGCTAGAGATCTCATTTGTGCGCAGAAGT - 24000

-SNMANA*VILMLEISFVRRS 

- A I . W R M P R * Y* C * RSH lcaev 

QY GECLGDINARDLICAQKF 

24001 - TCAATGGACTTACAGTGTTGCCACCTCTGCTCACTGATGATATGATTGCTGCCTACACTG - 24060

-SM DLQCCHLCSLMI * L L P T L 
QW TY SVAT SA H * *YDCCLHC 
NGLTVLPPLLTDDMIAAYTA 
24061 - CTGCTCTAGTTAGTGGTACTGCCACTGCTGGATGGACATTTGGTGCTGGCGCTGCTCTTC - 24120 

- L L * LVVLPLLDGHLVLALLF 
-CS:S*WYCHCWMDIWCWRCSS 

A L V S GTAT AGWT FGAGAALQ 
24121 - AAATACGTTTTGCTATGCAAATGGCATATAGGTTCAATGGCATTGGAGTTACCCAAAATG - 24180 
-KYLLLCKWH I GSMALELPKM 
-NT ; FCYANGI*VQWHWSYPKC 
I P;FAMQMAYRFNGIGVTQNV 
24181 - TTCTCTATGAGAACCAAAAACAAATCGCCAACCAATTTAACAAGGCGATTAGTCAAATTC - 24240
-FSMRTKNKS PTNLTRRLVKF 

-sl*epktnrqpi*qgd*sns 
ly enqkqianqfnkaisqiq 
24241 - aagaatcacttacaacaacatcaactgcattgggcaagctgcaagacgttgttaaccaga - 24300 
-knhlqqhqlhwascktlltr 
-ri;tynnincigqaarrc*pe 
es.lt ttstalgklqdvvnqn 
24301 - atgctcaagcattaaacacacttgttaaacaacttagctctaattttggtgcaatttcaa - 24360 

- m l k h * thllnnlali lvqfq 
-cssikhtc*tt*l*fwcnfk 

aq.alntlvkqlssnfgaiss 
24361 - GTGTGCTAAATGATATCCTTTCGCGACTTGATAAAGTCGAGGCGGAGGTACAAATTGACA - 24420

- v c * m i s frdliksrrryklt 

- c a k * ypfat" *srgggtn*q 

vl;ndilsrldkveaevqidr 

24421 - GGTTAATTACAGGCAGACTTCAAAGCCTTCAAACCTATGTAACACAACAACTAATCAGGG - 24480 

- G * IiQADFKAFKPM* H N N * SG 
-VNYRQTSKPSNLCNTTTNQG 

LI TGRLQSLQTYVTQQLIRA 
24481 - CTGCTGAAATCAGGGCTTCTGCTAATCTTGCTGCTACTAAAATGTCTGAGTGTGTTCTTG - 24540
-LLKSGLLLI LLLLKCLSVFL 
~C*NQGFC*SCCY*iSIV*VCSW 

ae:irasanlaatkmsecvlg 
24541 - gacaatcaaaaagagttgacttttgtggaaagggctaccaccttatgtcgttcccacaag - 24600 
-dnqkeltfverattlcpshk 
-tikks*llwkglppyvlpts 
qs- krvdfcgkgyhlmsfpqa 
24601 - cagccccgcatggtgttgtcttcctacatgtcacgtatgtgccatcccaggagaggaact - 24660 
-qp^mvls symsrmch prrgt 

- s p'awcclptchvcai PGEEL 

ap hgvvflhvtyvpsqernf 

24661 - TCACCACAGCGCCAGCAATTTGTCATGAAGGCAAAGCATACTTCCCTCGTGAAGGTGTTT - 24720 

- S P QRQQFVMKAKHT SLVKVF 
-HH'SASNLS*RQS ILPS*RCF 

TT APAICBEGKAYFPREGVF 
24721 - TTGTGTTTAATGGCACTTCTTGGTTTATTACACAGAGGAACTTCTTTTCTCCACAAATAA - 24780 
-LCfcMALIiGIiLHRGCSFLHK* 

-cv1 *whflvyyteeiilfstnn 
vfingtswfitqrnffspqii 
24781 - TTACTACAGACAATACATTTGTCTCAGGAAATTGTGATGTCGTTATTGGCATCATTAACA - 24840

i hlsqe ivmsllaslt 
-yy 'rq y i c l r k l * crywhh * q 
tt : dntfvsgncdvvigiinn 
24841 - acacagtttatgatcctctgcaacctgagcttgactcattcaaagaagagctggacaagt - 24900 
-tqfm i lcnlslthskkswts 
h s l * ssat*a*li qrragqv 
tv ; ydplqpelds fkeeldky 
24901 - ACTTCAAAAATCATACATCACCAGATGTTGATCTTGGCGACATTTCAGGCATTAACGCTT - 24960

- t s k i i hhqmlilatfqal tl 
-lq'ksyitrc*swrhfrh*rf 

fk'nhtspdvdlgdi sginas 
24961 - CTGTCGTCAACATTCAAAAAGAAATTGACCGCCTCAATGAGGTCGCTAAAAATTTAAATG - 25020

-LSSTFKKKLTASMRSLKI * M 

- CR;QHSKRN*PPQ*GR*KFK* 

VV NIQKEIDRLNEVAKNLNE 
25021 - AATCACTCATTGACCTTCAAGAATTGGGAAAATATGAGCAATATATTAAATGGCCTTGGT - 25080 
-NHSLT FKNWENMSN I LNGLG 

- I T H * PSRIGKI*AIY*MALV 

SLI DLQELGKYEQYIKWPWY 
25081 - ATGTTTGGCTCGGCTTCATTGCTGGACTAATTGCCATCGTCATGGTTACAATCTTGCTTT - 25140 
-MFGSASLLD*LPSSWLQSCF 
-CL ARLHCWTNCHRHGYNLAL 
VW.LGFIAGLIAIVMVTILLC 
25141 - GTTGCATGACTAGTTGTTGCAGTTGCCTCAAGGGTGCATGCTCTTGTGGTTCTTGCTGCA - 25200

- V A * LVVAVASRVHALVVLAA 

- L H D * LLQLPQGCMLLW FLLQ 

cm!tsccsclkgacscgscck 
25201 - AGTTTGATGAGGATGACTCTGAGCCAGTTCTCAAGGGTGTCAAATTACATTACACATAAA - 25260
-SLMRMTLSQ FSRVSNY I THK 
-V**G*L*ASSQGCQITLHIN 
FDEDDSE PVLKGVKLHYT * T 
25261 - CGAACTTATGGATTTGTTTATGAGATTTTTTACTCTTGGATCAATTACTGCACAGCCAGT - 25320
-RTYGFVYEI FYSWINYCTAS 
-ELMDIiFMRFFTLGSI TAQPV 
NLWICL* DFLLLDQLLHSQ* 
25321 - AAAAATTGACAATGCTTCTCCTGCAAGTACTGTTCATGCTACAGCAACGATACCGCTACA - 25380 

- K N *QCFSCKYCSCYSNDTAT 
-KIDNASPASTVHATATI PLQ 

KLTMLLLQVLFMLQQRYRYK 
25381 - AGCCTCACTCCCTTTCGGATGGCTTGTTATTGGCGTTGCATTTCTTGCTGTTTTTCAGAG - 25440 
-SLTPFRMACYWRCISCCFSE 
-ASLPFGWLVIGVAFLAVFQS 
PHSLSDGLLLALHFLLFFRA 
25441 - CGCTACCAAAATAATTGCGCTCAATAAAAGATGGCAGCTAGCCCTTTATAAGGGCTTCCA - 25500 
-RYQNNCAQ * KMAAS PL *GLP 

- A T K I IALNKRWQLALYKGFQ 

LP'K*LRSIKDGS*PFIRASS 
25501 - GTTCATTTGCAATTTACTGCTGCTATTTGTTACCATCTATTCACATCTTTTGCTTGTCGC - 25560 
-VHkQFTAAI CYHLFTS FACR 

- FICNLLLLFVTIYSHLLLVA 

SFAIYCCYLLPSIHIFCLSL 
25561 - TGCAGGTAAGGAGGCGCAATTTTTGTACCTCTATGCCTTGATATATTTTCTACAATGCAT - 25620 
~CR*GGAI FVPLCLDIFSTMH 
-AGKEAQFLYLYALIYFLQCI 
QV RRRN FCTSMP * Y I FYNAS 

25621 - caacgcatgtagaattattatgagatgttggctttgttggaagtgcaaatccaagaaccc - 25680 
-qrm*nyyemlall,evq iqep 

- nacriimrcwlcwkcksknp 

th vell* dvgfvgsanprth 
25681 - ATTACTTTATGATGCCAACTACTTTGTTTGCTGGCACACACATAACTATGACTACTGTAT - 25740

- i t l * cqllcllaht* l * lly 
-llydanyfvcwhthnydyci 

yf mmpttlfagthitmttvy 
25741 - ACCATATAACAGTGTCACAGATACAATTGTCGTTACTGAAGGTGACGGCATTTCAACACC - 25800
-ti*qcrryncry*r*rhfnt 
pynsvt dtivvtegdgistp 
hi'tvsqiqlsllkvtafqhq 
25801 - aaaactcaaagaagactaccaaattggtggttattctgaggataggcactcaggtgttaa - 25860 

-KTQRRLPNWWLF*G*ALRC* 
-KL .KE DYQ IGGYSEDRHSGVK 
NS.KKTTKLVVILRIGTQVLK 
25861 - AGACTATGTCGTTGTACATGGCTATTTCACCGAAGTTTACTACCAGCTTGAGTCTACACA - 25920
-RLCRCTWLFHRSLLPA*VYT 
-DYVVVHGYFTEVYYQLESTQ 
TMSLYMAISPKFTTSLSLHK 
25921 - AATTACTACAGACACTGGTATTGAAAATGCTACATTCTTCATCTTTAACAAGCTTGTTAA - 25980 
-NYYRHWY* KCYILHL*QAC* 
I T TDTG I ENAT FFI FNKLVK 
LLQTLVLKMLHSSSLTSLLK 
25981 - AGACCCACCGAATGTGCAAATACACACAATCGACGGCTCTTCAGGAGTTGCTAATCCAGC - 26040
-R PTE CANT HNRRLFRSC* S S 
-DPPNVQIHTIDGSSGVANPA 
THRMCKYTQSTALQELLIQQ 
26041 - AATGGATCCAATTTATGATGAGCCGACGACGACTACTAGCGTGCCTTTGTAAGCACAAGA - 26100
-NGSNL* *ADDDY*RAFVSTR 
-MDPIYDEPTTTTSVPL*AQE 
WIQFMMSRRRLLACLCKHKK 
26101 - AAGTGAGTACGAACTTATGTACTCATTCGTTTCGGAAGAAACAGGTACGTTAATAGTTAA - 26160
-K*VRTYVLI RFGRNRYVNS* 
-SE YELMYSFVSEETGTLIVN 
VS TNLCT HS FRKKQVR* * L I 
26161 - TAGCGTACTTCTTTTTCTTGCTTTCGTGGTATTCTTGCTAGTCACACTAGCCATCCTTAC - 26220

- * RTS FSCFRGI LASHTSHPY 

- SVLLFLAFVVFLLVTLAILT 

AY FFFLLSWYSC* S H * PSLL 
26221 - TGCGCTTCGATTGTGTGCGTACTGCTGCAATATTGTTAACGTGAGTTTAGTAAAACCAAC - 26280 
-CAS IVCVLLQYC *REFSKTN 
-ALRLCAYCCNIVNVSLVKPT 
RF.DCVRT A A I LLT * V * * NQR 
26281 - GGTTTACGTCTACTCGCGTGTTAAAAATCTGAACTCTTCTGAAGGAGTTCCTGATCTTCT - 26340 
-GLRLLAC*KSELF*RSS*SS 
-VYVYSRVKNLNSSEGVPDLL 
FTSTRVLKI*TLLKBFLIF 1 W 
26341 - GGTCTAAACGAACTAACTATTATTATTATTCTGTTTGGAACTTTAACATTGCTTATCATG - 26400 
-GLNELTI I I ILFGTLTLLIM 
«V*T1S1^LLXLFCLEL*HCLSW 
SK RTNYYYYSVWNFN IAYHG 
26401 - GCAGACAACGGTACTATTACCGTTGAGGAGCTTAAACAACTCCTGGAACAATGGAACCTA - 26460
-ADNGTITVEELKQLLEQWNL 
-QTTVLLPLRSLNNSWNNGT* 
RQ RYYYR * G A * TTPGTMEPS 
26461 - GTAATAGGTTTCCTATTCCTAGCCTGGATTATGTTACTACAATTTGCCTATTCTAATCGG - 26520 
-VIGFLFLAW IMLLQFAYS NR 

- + *VSYS*PGLCYYNLPI1jIG 

NRFPIPSL DYVTTICLF^SE 
26521 - AACAGGTTTTTGTACATAATAAAGCTTGTTTTCCTCTGGCTCTTGTGGCCAGTAACACTT - 26580

- N R F L Y I IKLVFLWLLWPVTL 

- T G F C T * * SLFSSGSCGQ*HL 

QV FVHN KACFPLALVASNTC 
26581 - GCTTGTTTTGTGCTTGCTGTTGTCTACAGAATTAATTGGGTGACTGGCGGGATTGCGATT - 26640 
-ACFVLAVVYRINWVTGGIAI 
-LV.LCLLLSTELIG* LAGLRL 
LF CACCCLQN* LGDWRDCDC 
26641 - GCAATGGCTTGTATTGTAGGCTTGATGTGGCTTAGCTACTTCGTTGCTTCCTTCAGGCTG - 26700
-AMACIVGLMWLSYFVASFRL 
~QWLVL*A* CGLATSLLPSGC 
NGLYCRLDVA*LLRCFLQAV 
26701 - TTTGCTCGTACCCGCTCAATGTGGTCATTCAACCCAGAAACAAACATTCTTCTCAATGTG - 26760
-FARTRSMW S FN PE TN I LLNV 
-LLVPAQCGHSTQKQTFFSMC 
CS YPLNVVIQPRNKHSSQCA 
26761 - CCTCTCCGGGGGACAATTGTGACCAGACCGCTCATGGAAAGTGAACTTGTCATTGGTGCT - 26820

- P LRGT IVTRPLMESELVI GA 
-ls'ggql* PDRSWKVNLSLVL 

SP.GDNCDQTAHGK*TCHWCC 
26821 - GTGATCATTCGTGGTCACTTGCGAATGGCCGGACACTCCCTAGGGCGCTGTGACATTAAG - 26880 
-V I IRGHLRMAGHSLGRCDIK 
-*SFVVTCEWPDT?*GAVTLR 

DHSWSLANGRTLPRAL*H*G 

FIG. 11 Con't




26881 - gacctgccaaaagagatcactgtggctacatcacgaacgctttcttattacaaattagga - 26940 
-dlpkeitvatsrtlsyyklg 
-tc:qkrslwlhherflitn*e 
pa'krdhcgyitnafllqirs 

26941 - gcgtcgcagcgtgtaggcactgattcaggttttgctgcatacaaccgctaccgtattgga - 27000
-asqrvgtdsgfaaynryrig 

-RRSV*ALIQVLLHTTATVLE 

V A A C R H * FRFCCIQPLPYWK 

27001 - AACTATAAATTAAATACAGACCACGCCGGTAGCAACGACAATATTGCTTTGCTAGTACAG - 27060 
-NYKLNTDHAGSN DNIALLVQ 

- T I N * IQTTPVATTILLC* YS 

L*. IKYRPRR* QRQYCFASTV 
27061 - TAAGTGACAACAGATGTTTCATCTTGTTGACTTCCAGGTTACAATAGCAGAGATATTGAT - 27120 
-*VTTDVSSC*LPGYNSRDID 
-K* : QQMFHLVDFQVTIAE I hi 
SD NRCFILLTSRLQ* Q R Y * L 
27121 - TATCATTATGAGGACTTTCAGGATTGCTATTTGGAATCTTGACGTTATAATAAGTTCAAT - 27180 
-YHYE DFQDCYLE S * RYNKFN 
-II MRTFRIAIWNLDVIISSI 
S L * GLSGLLFGILTL* * V Q * 
27181 - AGTGAGACAATTATTTAAGCCTCTAACTAAGAAGAATTATTCGGAGTTAGATGATGAAGA - 27240
-SETII*ASN±EELFGVR**R 
-VR;QLFKPLTKKNYSELDDEE 
* D ? N Y L S L * la R R I IRS*MMKN 
27241 - ACCTATGGAGTTAGATTATCCATAAAACGAACATGAAAATTATTCTCTTCCTGACATTGA - 27300 
-TYGVRLS IKRT*KLFSS*H* 
-PM-BIiDYP*NEHENYSIiPDID 
LW:S * I IHKTNMKI ILFLTLI 
27301 - TTGTATTTACATCTTGCGAGCTATATCACTATCAGGAGTGTGTTAGAGGTACGACTGTAC - 27360 
-LYiHLASYITIRSVLEVRLY 
-CI YILRAISliSGVC*RYDCT 

V F T SCELYHYQECVRGTTVL 

27361 - TACTAAAAGAACCTTGCCCATCAGGAACATACGAGGGCAATTCACCATTTCACCCTCTTG - 27420
-Y*KNLAHQEHTRAIHHFTLL 
TK.RTLPIRN IRGQFTI SPSC 
LK;EPCPSGTYEGNSPFHPIiA 
27421 - CTGACAATAAATTTGCACTAACTTGCACTAGCACACACTTTGCTTTTGCTTGTGCTGACG - 27480 
-LTINLH*LALAHTLLLLVI ) T 

- * Q : * I C T N L H * HTLCFCIjC*R 

DN;K FALTC T S T H FAFACADG 
27481 - GTACTCGACATACCTATCAGCTGCGTGCAAGATCAGTTTCACCAAAACTTTTCATCAGAC - 27540 
* - VLDI PI SCVQDQFHQNFSSD 

- YSTYLSAACKI S FTKT FH QT 

TRHTYQLRARSVSPKLFIRQ 
27541 - AAGAGGAGGTTCAACAAGAGCTCTACTCGCCACTTTTTCTCATTGTTGCTGCTCTAGTAT - 27600 
-KRRFNKSSTRHFFSLLLL*.Y 

- RG : GSTRALLATFSKCCCSSI 

EE-VQQELYSPLFLIVAALVF 
27601 - TTTTAATACTTTGCTTCACCATTAAGAGAAAGACAGAATGAATGAGCTCACTTTAATTGA - 27660

- f * yfasplrerqts)e*ahfn* 
-fn:tllhh*ekdrmneltlid 

li:lcftikrkte*mssl*lt 

27661 - CTTCTATTTGTGCTTTTTAGCCTTTCTGCTATTCCTTGTTTTAATAATGCTTATTATATT - 27720 
-LLFVLFSLSAI PCFNNAYYI 
-FYLCFLAFLLFLVLIMLIIF 
S I C A F * PFCYSLF* * C L L Y F 
27721 - TTGGTTTTCACTCGAAATCCAGGATCTAGAAGAACCTTGTACCAAAGTCTAAACGAACAT - 27780
-LVFTRNPGSRRTLYQSLNEH 
-WFSLEIQDLEEPCTKV*TNM 
G F H S KSRI * K N L V P KS KRT * 
27781 - GAAACTTCTCATTGTTTTGACTTGTATTTCTCTATGCAGTTGCATATGCACTGTAGTACA - 27840
-ETSH CFDLY FSMQLHMHCS T 
-KLLIVLTCISLCSCICTVVQ 
NFSLF*LVFLYAVAYAL*YS 
27841 - GCGCTGTGCATCTAATAAACCTCATGTGCTTGAAGATCCTTGTAAGGTACAACACTAGGG - 27900 

- A L C I * *TSCA*RSL* GTTLG 
-RC ASNKPHVLE DPCKVQH* G 

AVHLINLMCLKI LVRYNTRG 

27901 - GTAATACTTATAGCACTGCTTGGCTTTGTGCTCTAGGAAAGGTTTTACCTTTTCATAGAT - 27960

-VIJiIAI»LGFVL*ERFYLFID 
-*YL*HCXiALCSRKGFTFS*M 
NTYSTAWLCALGKVLPFHRW 
27961 - GGCACACTATGGTTCAAACATGCACACCTAATGTTACTATCAACTGTCAAGATCCAGCTG - 28020 

- G T lr W FKHAHLMLLSTVKIQL 
-AHYGSNMHT*CYYQLSRSSW 

HTMVQTCTPNVT INCQDPAG 

28021 - GTGGTGCGCTTATAGCTAGGTGTTGGTACCTTCATGAAGGTCACCAAACTGCTGCATTTA - 28080

-VVRL * LGVGT FMKVTKLLHL 
-WCAY3*VLVPS*RSPNCCI* 
GALIARCWYLHEGHQTAAFR 
28081 - GAGACGTACTTGTTGTTTTAAATAAACGAACAAATTAAAATGTCTGATAATGGACCCCAA - 28140

-etyllf* ineqikms dngpq 
-rrtccfk*tnklkclimdpn 
dv!lvvlnkrtn*nv**wtpi 
28141 - tcaaaccaacgtagtgccccccgcattacatttggtggacccacagattcaactgagaat - 28200 
-snqrsapritfggft dstdn 

- qt nvvp palhlvdpqi q l t i 

kft*cpphyiwwthrfn*q* 

28201 - AACCAGAATGGAGGACGCAATGGGGCAAGGCCAAAACAGCGCCGACCCCAAGGTTTACCC - 28260
-NQNGGRNGARPKQRRPQGLP 
TRME DAMGQGQNSADPKVYP 
PEWRTQWGKAKTAPTPRFTQ 
28261 - AATAATACTGCGTCTTGGTTCACAGCTCTCACTCAGCATGGCAAGGAGGAACTTAGATTC - 28320
-NNTASWFTALTQHOKEELRF 

- IILRLGSQLSLSMARRNLDS 

*YCVLVHSSHSAWQGGT*IP 
28321 - CCTCGAGGCCAGGGCGTTCCAATCAACACCAATAGTGGTCCAGATGACCAAATTGGCTAC - 28380 
-PRGQGVPINTNSGPDDQIGY 
-LEARAFQSTPIVVQMTKLAT 
S R P GRSNQHQ * W S R * PNWLL 
28381 - TACCGAAGAGCTACCCGACGAGTTCGTGGTGGTGACGGCAAAATGAAAGAGCTCAGCCCC - 28440 
-YRRATRRVRGGDGKMKELSP 
-TE -ELPDEFVVVTAK* KSSAP 
PK.SYPTSSWW*RQNERAQPQ 
28441 - AGATGGTACTTCTATTACCTAGGAACTGGCCCAGAAGCTTCACTTCCCTACGGCGCTAAC - 28500 
-RWYFYYLGTGPEASLPYGAN 
-DGTS IT*ELAQKLHFPTAIiT 
MVLLLPRNWPRSFTSLRR*Q 
28501 - AAAGAAGGCATCGTATGGGTTGCAACTGAGGGAGCCTTGAATACACCCAAAGACCACATT - 28560 

- K E G I VWVATEGALNT PKDHI 

- K K A S YGLQLREP *IHPKTTL 

RRHRMGCN*GSLEYTQRPHW 
28561 - GGCACCCGCAATCCTAATAACAATGCTGCCACCGTGGTACAACTTCCTCAAGGAACAACA - 28620
-GTRN PNNNAATVLQLPQGTT 

- A P A I L I TMLPPCYNFLKEQH 

H P Q S * *QCCHRATTSSRNNI 
28621 - TTGCCAAAAGGCTTCTACGCAGAGGGAAGCAGAGGCGGCAGTCAAGCCTCTTCTCGCTCC - 28680
-LPKG FYAEGSRGGSQAS SRS 
-CQ KASTQREAEAAVKPLLAP 
AKRLLRRGKQRRQSSLFSLL 
28681 - TCATCACGTAGTCGCGGTAATTCAAGAAATTCAACTCCTGGCAGCAGTAGGGGAAATTCT - 28740
-SSRSRGN5RNSTPGSSRGNS 
-HHVVAVIQEIQLLAAVGEIL 
IT-*SR*FKKFNSWQQ*6KPS 
28741 - CCTGCTCGAATGGCTAGCGGAGGTGGTGAAACTGCCCTCGCGCTATTGCTGCTAGACAGA - 28800 
-PARMAS GGGETALALL LL DR 
-LLEWLAEVVKLPSRYCC* T D 
CS N G * RRW * N CP RAI A A R Q 1 
28801 - TTGAACCAGCTTGAGAGCAAAGTTTCTGGTAAAGGCCAACAACAACAAGGCCAAACTGTC - 28860
-LNQLESKVSGKGQQQQGQTV 
* TSLRAKFLVKAWNNKAKLS 
EP:A*EQSFW* RPTTTRPNCH 
28861 - ACTAAGAAATCTGCTGCTGAGGCATCTAAAAAGCCTCGCCAAAAACGTACTGCCACAAAA - 28920
-TKKSAAEAS KKPRQKRTATK 
-LR : NLIj1iRKLKSLAKNVLPQN 

*e'icc*gi*kaspktychkt 
28921 - cagtacaacgtcactcaagcatttgggagacgtggtccagaacaaacccaaggaaatttc - 28980
-qynvtqafgrrgpeqtqgnf 
-sttslkhlgdvvqnkpkeis 
vqrhss iwetwsrtnprkfr 
28981 - ggggaccaagacctaatcagacaaggaactgattacaaacattggccgcaaattgcacaa - 29040 
-gdqdlirqgtdykkw pq iaq 
-gt *kt*sdkelitnigrklh1s) 
gprpnqtrn*lqtlaancti 
29041 - tttgctccaagtgcctctgcattctttggaatgtcacgcattggcatggaagtcacacct - 29100 
-fapsasaffgmsrigmevtp 
-llqvplrslechalawkshl 
cs-kclcilwnvthwhgshtf 
29101 - tcgggaacatggctgacttatcatggagccattaaattggatgacaaagatccacaattc - 29160 
-sgtwltyhgaiklddkdpqf 
-re : hg*limeplnwmtkihns 
gnmadlswsh* ig*qrstiq 
29161 - aaagacaacgtcatactgctgaacaagcacattgacgcatacaaaacattcccaccaaca - 29220 
-kdnv illwkhi dayktfp pt 
-kt tsyc* tstlthtkhshqq 
rq rh ta eqah * r i qn i ptnr 
29221 - gagcctaaaaaggacaaaaagaaaaagactgatgaagctcagcctttgccgcagagacaa - 29280 

-EPKKDKKKKTDEAQPLPQRQ 
-SLKRTKRKRLMKLSLCRRDK 
A*.KGQKEKD** SSAFAAETK 
29281 - AAGAAGCAGCCCACTGTGACTCTTCTTCCTGCGGCTGACATGGATGATTTCTCCAGACAA - 29340 

- K K Q PTVTLLPAADMDDFSRQ 
-RSSPL*LFFLRLTWMISPDN 

EAAHCDSSSCG*HG* FLQTT 
29341 - CTTCAAAATTCCATGAGTGGAGCTTCTGCTGATTCAACTCAGGCATAAACACTCATGATG - 29400 
-LQNSMSGASADSTQA* TLMM 

- F K I P*VELLLIQLRHKHS* * 

SKFHEWSFC*FNSGINTHDD 
29401 - ACCACACAAGGCAGATGGGCTATGTAAACGTTTTCGCAATTCCGTTTACGATACATAGTC - 29460
-TTQGRWAM * TFSQFRLRYIV 

- P H KADGLCKRFRN SVY DT * S 

HTlRQMGYVNVFA I P FT I H S L 
29461 - TACTCTTGTGCAGAATGAATTCTCGTAACTAAACAGCACAAGTAGGTTTAGTTAACTTTA - 29520

- Y S C A E * ILVTKQHK*V*I>TI> 
~TL : VQNEFS *LNSTSRFS * L * 

LL^CRMN S RH * TAQV GLVN FN 
29521 - ATCTCACATAGCAATCTTTAATCAATGTGTAACATTAGGGAGGACTTGAAAGAGCCACCA - 29580 
-ISHSNL* SMCNIREDLKEPP 
-SHIAIFNQCVTLGRT*KSHH 
LT:*QSLINV*H*GGLERATT 
29581 - CATTTTCATCGAGGCCACGCGGAGTACGATCGAGGGTACAGTGAATAATGCTAGGGAGAG - 29640 
-HFHRGHAEYDRGYSE* C * G B 
-IF ; IEATRSTIEGTVNNARES 
FSSRPRGVRSRVQ* IMLGRA 
29641 - CTGCCTATATGGAAGAGCCCTAATGTGTAAAATTAATTTTAGTAGTGCTATCCCCATGTG - 29700 
-IiPjWKSPNV*N*F**CYPHV 

-cl'ygralmckinfssaipm* 

AY : MEEP*CVKLILVVLSPCD 
29701 - ATTTTAATAGCTTCTTAGGAGAATGACAAAAAAAAAAAAAAA - 29742

- I L 1 A S * EN DKKKKKX 

- jr * * LLRRM TKKKKX 

FN S FLGE * QKKKKX 
1 - TTTTTTTTTTTTTTTGTCATTCTCCTAAGAAGCTATTAAAATCACATGGGGATAGCACTA - 60
-FFFFFVI LLRSY*NHMGIAL 

-ff : ffiisfs*eaikitwg*hy 
f f, ffchspkkllkshgdstt 
61 - ctaaaattaattttacacattagggctcttccatataggcagctctccctagcattattc - 120 
-lklilhi ralpyrqlslalf 
* n * fytlglfhi gss p * hys 
kinfth*gssi*aalpsiih 
121 - actgtaccctcgatcgtactccgcgtggcctcgatgaaaatgtggtggctctttcaagtc - 180

- t v £ s ivlrvasmkmwwlfqv 

- ly prsy saw p r * kcg gs fks 

ct ldrt prgldenvvals sp 
181 - ctccctaatgttacacattgattaaagattgctatgtgagattaaagttaactaaaccta - 240
-lpfjvth*lkiam*d*s*lnl 
s l mlhi d * rllce ikvn * ty 
p*. cytli kdcyvrlkltkpt 
241 - cttgtgctgtttagttacgagaattcattctgcacaagagtagactatgtatcgtaaacg - 300 
-lvlfsyens fctrvdyvs * t 

- lc'clvtri h saqe * tmyrkr 

ca v^lrefilhksrlcivng 
301 - gaattgcgaaaacgtttacatagccgatctgcgttgtgtggtcatcatgagtgtttatgc - 360 
-elrkrlh s p salcghheclc 
-nc'.envyiahlpcvvimsvya 
ia;ktft*piclvwss*vfmp 
361 - ctgagttgaatcagcagaagctccactcatggaattttgaagttgtctggagaaatcatc - 420 
-ls*isrssthgilklsgeiii 
~*v-esaeaplmef*sclekss 
bl : nqqklhswnfevvwrnhp 

421 - CATGTCAGCCGCAGGAAGAAGAGTCACAGTGGGCTGCTTCTTTTGTCTCTGCGGCAAAGG - 480
-HVSRRKKSHSGLLLLSLRQR 
-MS AAGRRVT VGCFFCLCGKG 
CQ PQEEESQWAASFVSAAKA 

481 - CTGAGCTTCATCAGTCTTTTTCTTTTTGTCCTTTTTAGGCTCTGTTGGTGGGAATGTTTT - 540

- Ii S F I SLFLFVLFRIjCWWECF 

*A"SSVFFFLSFLGSVGGNVL 
ELjHQSFS FCP F*ALLVGMFC 
541 - GTATGCGTCAATGTGCTTGTTCAGCAGTATGACGTTGTCTTTGAATTGTGGATCTTTGTC - 600 
-VCVNVLVQQYDVVFELWl FV 
-YA'SMCLFSSMTLSLNCGSLS 
MR QCACSAV*RCL*IVDLCH 
601 - ATCCAATTTAATGGCTCCATGATAAGTCAGCCATGTTCCCGAAGGTGTGACTTCCATGCC - 660 
-IQFNGSMISQPCSRRCDFHA 
-SN -LMAP* *VSHVPEGVTSMP 
PIj*WLHDKSAMFPKV*LPCQ 
661 - AATGCGTGACATTCCAAAGAATGCAGAGGCACTTGGAGCAAATTGTGCAATTTGCGGCCA - 720 
-NA*HSKECRGTWSKLCNLRP 
-MRDI PKNAEALGANCAICGQ 
C V;T FQRM QRHLE Q I V Q FAAN 

721 - atgtttgtaatcagttccttgtctgattaggtcttggtccccgaaatttccttgggtttg - 780 
~mfyisslsd*vlvpeislgl 
-cl;*svpclirswspkfpwvc 
v c nqflv * lglg p rn flg fv 

781 - ttctggaccacgtctcccaaatgcttgagtgacgttgtactgttttgtggcagtacgttt - 840
-fwttspkclsdvvlfcgstf 

- s g'prlpna * vt l y c fvavrf 

ld hvsqmle'-rctvlwqyvf 
841 - TTGGCGAGGCTTTTTAGATGCCTCAGCAGCAGATTTCTTAGTGACAGTTTGGCCTTGTTG - 900

-LARLFRCkS srfls dslall 

-WR-GFLDASAADFLVTVWPCC 
GEAF*MPQQQIS * * QFGLVV 
901 - TTGTTGGCCTTTACCAGAAACTTTGCTCTCAAGCTGGTTCAATCTGTCTAGCAGCAATAG - 960 
-IiLAFTRNFALKLVQSV*QQ* 
-CWPLPETLLSSWFNLSSSNS 
VGLYQKLCSQAGSI C L A A I A 
961 - CGCGAGGGCAGTTTCACCACCTCCGCTAGCCATTCGAGCAGGAGAATTTCCCCTACTGCT - 1020 
-REGSFTTSASHSSRRI SPTA 

- A R '■ A V S P P P LA I RAGE FP LLL 

RG-QFHHLR*PFEQENFP*YCC 
1021 - GCCAGGAGTTGAATTTCTTGAATTACCGCGACTACGTGATGAGGAGCGAGAAGAGGCTTG - 1080 

- A R S * IS* I T A T T * * G A R R G L 
-PG VEFLELPRLRDEEREEA* 

QELNFLNYRDYVMRSEKRLD 
1081 - ACTGCCGCCTCTGCTTCCCTCTGCGTAGAAGCCTTTTGGCAATGTTGTTCCTTGAGGAAG - 1140 
-TAASASLCVEAFWQCCSLRK 
-LP : P1>LPSA*KPFGNVVP*GS 
CR-LCFPLRRSLLAMLFLEEV 
1141 - TTGTAGGACGGTGGCAGCATTGTTATTAGGATTGCGGGTGCCAATGTGGTCTTTGGGTGT - 1200 
^ L * HGGS IVI RIAGANVVFGC 
-C S TVAALLLGLRV PMWSLGV 
VA;RWQHCY * DCGCQCGLWVY 
1201 - ATTCAAGGCTCCCTCAGTTGCAACCCATACGATGCCTTCTTTGTTAGCGCCGTAGGGAAG - 1260
-IQC?SLSCNPYDAFH'VSAV GK 
-FKAPSVATHTMPSLLAP* GS 
S R LPQLQP IRCLLC* RRREV 
1261 - TGAAGCTTCTGGGCCAGTTCCTAGGTAATAGAAGTACCATCTGGGGCTGAGCTCTTTCAT - 1320

- * SFWAS5*VIEVPSGAELFH 
-EA.SGPVPR* *KYHLGLSSFI 

KLLGQFLGNRSTIWG*ALSF 
1321 - TTTGCCGTCACCACCACGAACTCGTCGGGTAGCTCTTCGGTAGTAGCCAATTTGGTCATC - 1380
-FAVTTTNSSGSSSVVANLVI 
-LP SPPRTRRVALR**PIWSS 
CR HHHELVG*LFGSSQFGHL 
1381 - TGGACCACTATTGGTGTTGATTGGAACGGCCTGGCCTCGAGGGAATCTAAGTTCCTCCTT - 1440 
-WTTIGVDWNALASRESKFLL 
-GP'LIjVLIGT PWPRGNLSSSL 
DH'YWC*LERPGLEGI*VPPC 
1441 - GCCATGGTGAGTGAGAGCTGTGAACCAAGACGCAGTATTATTGGGTAAACCTTGGGGTCG - 1500 
-AM LSESCEPRRSI I G * T L G S 

- P C * V RAVNQ DAV LLGK P W GR 

H A E * E L * TKTQYYWVNLGVG 
1501 - GCGCTGTTTTGGCCTTGCCCCATTGCGTCCTCCATTCTGGTTATTGTCAGTTGAATCTGT - 1560 
-ALFV? PCPIASS ILVIVS * I C 
-RC FGLAPLRPPFWLLSVESV 
AV LALPHCVLHSGYCQLNLW 
1561 - GGGTCCACCAAATGTAATGCGGGGGGCACTACGTTGGTTTGATTGGGGTCCATTATCAGA - 1620 
-GST'KCMAGGTTLV*LGSIIR 
-GPiPNVMRGALRWFDWGPLSD 
VH QM*CGGHYVGIiIGVHYQT 
1621 - CATTTTAATTTGTTCGTTTATTTAAAACAACAAGTACGTCTCTAAATGCAGCAGTTTGGT - 1680 

- H FNL FVYLKQQV R L * M Q Q FG 

- IL"ICSFI*NNKYVSKCSSLV 

F* FVRLFKTTSTSLNAAVW* 
1681 - GACCTTCATGAAGGTACGAACACCTAGCTATAAGCGCACCACCAGCTGGATCTTGACAGT - 1740
-DLHE GTNT* L * A H H Q L DLDS 
-TF.MKVPTPSYKRTTSWILTV 
P S. * RYQHLAI SAP PAG S * QL 

1741 - TGATAGTAACATTAGGTGTGCATGTTTGAACCATAGTGTGCCATCTATGAAAAGGTAAAA - 1800 

- * * *H*VCMFEP*CAIYEKVK 

- DS N I RCACLNHSVPSMKR*H 

IVTLGVHV*TIVCHL*KGKT 
1801 - CCTTTCCTAGAGCACAAAGCCAAGCAGTGCTATAAGTATTACCCCTAGTGTTGTACCTTA - 1860
-PFLE HKAKQCYKYYP* CCTL 
-LS*STKPSSAISITPSVVPY 
FP RAQSQAVL^VLPLVLYLT 
1861 - CAAGGATCTTCAAGCACATGAGGTTTATTAGATGCACAGCGCTGTACTACAGTGCATATG - 1920
-QGSS ST*GLLDAQRCTTVHM 
-KD LQAREVY*MHSAVLQCIC 
RIFKHMRFIRCTALYYSAYA 
1921 - CAACTGCATAGAGAAATACAAGTCAAAACAATGAGAAGTTTCATGTTCGTTTAGACTTTG - 1980 
-QLHREIQVKTMRS F M F V * TL 
-HC IEKYKSKQ*EVSCSFRLW 
TA *RNTSQNNEKFHVRLDFG 
1981 - GTACAAGGTTCTTCTAGATCCTGGATTTCGAGTGAAAACCAAAATATAATAAGCATTATT - 2040
-VQGSSRSWISSENQNIISII 
-YK VLLDPGFRV KTKI * * A L L 
T R F F * ILDFE*KPKYNKHY* 
2041 - AAAACAAGGAATAGCAGAAAGGCTAAAAAGCACAAATAGAAGTCAATTAAAGTGAGCTCA - 2100 
-KTRN SRKAKKHK* K S IKVSS 

- K Q G I AERLKSTNRSQLK*AH 

NK-E* QKG*KAQIEVN * SELI 
2101 - TTCATTGTGTCTTTCTCTTAATGGTGAAGCAAAGTATTAAAAATACTAGAGCAGCAACAA - 2160
-Fits F S * W * SKVLKI LEQQQ 
S F.CL SLNGEAKY * KY * SSNN 
HS^VFLLMVKQS I KNTRAATM 
2161 - TGAGAAAAAGTGGCGAGTAGAGCTCTTGTTGAACCTCCTCTTGTCTGATGAAAAGTTTTG - 2220 
-+ERVASRALVEPPLV**KVL 

- EK : KWRVELLLN1>LLS DEKFW 

RK : SGE*SSC*TSSCI»MKSFG 
2221 - GTGAAAGTGATCTTGCACGCAGCTGATAGGTATGTCGAGTACCGTCAGCACAAGCAAAAG - 2280
-VKilLHAADRYVEYRQHKQK 
-*N : *SCTQLXGMSSTVSTSKS 
ET DLARS* *VCRVPSAQAKA 
2281 - CAAAGTGTGTGCTAGTGCAAGTTAGTGCAAATTTATTGTCAGCAAGAGGGTGAAATGGTG - 2340 

- Q S V C * CKLVQIY CQQEGEMV 
-KV CASAS*CKFIVSKRVKW* 

KC'VLVQVSAN LL SARG * NGE 
2341 - AATTGCCCTCGTATGTTCCTGATGGGCAAGGTTCTTTTAGTAGTACAGTCGTACCTCTAA - 2400
-NCPRMFLMGKVLLVVQSYL* 
-IA:LVCS*WARFF**YSRTSN 
LP'SYVPDGQGSFSSTVVPLT 
2401 - CACACTCCTGATAGTGATATAGCTCGCAAGATGTAAATACAATCAATGTCAGGAAGAGAA - 2460
-H T P DS DIARKM* IQSMSGRE 
-TL LI VI *LARCKYNQCQEEN 
HS ***YSSQDVN?INVRKRI 
2461 - TAATTTTCATGTTCGTTTTATGGATAATCTAACTCCATAGGTTCTTCATCATCTAACTCC - 2520

- * FSCSFYG*SNSIGSSSSNS 

N F HV RFMDNLT P * VLHHLT P 
I F MFVLWI I * LHRFFI I * LR 
2521 - GAATAATTCTTCTTAGTTAGAGGCTTAAATAATTGTCTCACTATTGAACTTATTATAACG - 2580

- E * FFLVRGIiNNCLTIELI IT 
-NNSS*LEA*I IVSLLNLL*R 

IILLS*RLK*LSHY* TYYNV 

2581 - tcaagattccaaatagcaatcctgaaagtcctcataatgataatcaatatctctgctatt - 2640
-srfqiailkvlim::inisai 

-QDSK*QS*KSS***SISLLL 
KI-PNSNPESPHNDNQYLCYC 
2641 - GTAACCTGGAAGTCAACAAGATGAAACATCTGTTGTCACTTACTGTACTAGCAAAGCAAT - 2700
-VTWKSTR*NICCHI»I»Y*QSN 
-*PGSQQDETSVVTYCTSKAI 
NLEVNKMKHLLSLTVLAKQY 
2701 - ATTGTCGTTGCTACCGGCGTGGTCTGTATTTAATTTATAGTTTGCAATACGGTAGCGGTT - 2760
-IVVATGVVCI*FIVSNTVAV 
-LS LLPAWSVFNL * F P I R * RL 
CRCYRRGLYLIYSFQYGSGC 
2761 - GTATGCAGCAAAACCTGAATCAGTGCCTACACGCTGCGACGCTCCTAATTTGTAATAAGA - 2820 
-VCSKT* I SAYTLRRS* FVIR 
-YAAKPESVPTRCDAPNL* * E 
MQ QN LNQCLHAATLLI CNKK 
2821 - AAGCGTTCGTGATGTAGCCACAGTGATCTCTTTTGGCAGGTCCTTAATGTCACAGCGCCC - 2880 
-KRS * CSHS DLFWQVLNVTAP 
-SVRDVATVISFGRSLMSQRP 
AF-VM*PQ*SLLAGP*CHSAL 
2881 - TAGGGAGTGTCCGGCCATTCGCAAGTGACCACGAATGATCACAGCACCAATGACAAGTTC - 2940 
-*GYSGHSQVTTNDH3TNDKF 

- R E C P AI R K * PRMI TAPMTSS 

GS VRPFASDHE* SQHQ*QVH 
2941 - ACTTTCCATGAGCGGTCTGGTCACAATTGTCCCCCGGAGAGGCACATTGAGAAGAATGTT - 3000
-TFHERSGHNCPPERHIEKNV 
-LSMSGLVT1VPRRGTLRRMF 
FP*AVWSQLS P G E A H * E E C L 
3001 - TGTTTCTGGGTTGAATGACCACATTGAGCGGGTACGAGCAAACAGCCTGAAGGAAGCAAC - 3060 
-CFWVE * P H * AGTSKQPEGSN 
-VSGLNDHIERVRANSLKEAT 
FLG*MTTLSGYEQTA*RKQR 
3061 - GAAGTAGCTAAGCCACATCAAGCCTACAATACAAGCCATTGCAATCGCAATCCCGCCAGT - 3120 
-EVAKPHQAYNTS HCNRNPAS 

- K * L S HI KPT I Q A I A I AI P PV 

SS*ATSSLQYKPLQSQS,RQS 
3121 - CACCCAATTAATTCTGTAGACAACAGCAAGCACAAAACAAGCAAGTGTTACTGGCCACAA - 3180
-HPINSVDTSISKHKTSKCYWPQ 

- T Q L I L * TTASTKQASVTGHK 

P N * FCRQQQAQNKQVLLATR 
3181 - GAGCCAGAGGAAAACAAGCTTTATTATGTACAAAAACCTGTTCCGATTAGAATAGGCAAA - 3240
-EPEENKLYYVQKPVPIRI GK 
-SQ RKTS FIMYKNLFRLE* AN 
ARGKQALLCTKTCSD*NRQI 
3241 - TTGTAGTAACATAATCCAGGCTAGGAATAGGAAACCTATTACTAGGTTCCATTGTTCCAG - 3300 
-L**HNPG*E*ETYY*VPI>FQ 
-CSNI IQARNRKPITRFHCSR 
VVT*SRLGIGNLLLGSIVPG 
3301 - GAGTTGTTTAAGCTCCTCAACGGTAATAGTACCGTTGTCTGCCATGATAAGCAATGTTAA - 3360 
-ELFKLLNGNSTVVCHDKQC* 
-SCLSSSTVIVPLSAMISMVK 
VV*APQR**YRCLP**AMLK 
3361 - AGTTCCAAACAGAATAATAATAATAGTTAGTTCGTTTAGACCAGAAGATCAGGAACTCCT - 3420
-SSKQNNNNS *FV*TRRSGTP 

- V P : N R I I I JVSSFRPEDQELL 

F Q T E * * * * LVRLDQKIRNSF 
3421 - TCAGAAGAGTTCAGATTTTTAACACGCGAGTAGACGTAAACCGTTGGTTTTACTAAACTC - 3480
-SEEFRFLTRE* T*TVGFTKL 
-QKSSDF*HASRRKPLVLLNS 
RR. VQIFNTRVDVNRWFY*TH 
3481 - ACGTTAACAATATTGCAGCAGTACGCACACAATCGAAGCGCAGTAAGGATGGCTAGTGTG - 3540 
-TLTILQQYAHNRSAVRMASV 
-R*QYCSSTHTIEAQ*GWLV* 
VNNIAAVRTQSKRSKDG*CD 
3541 - ACTAGCAAGAATACCACGAAAGCAAGAAAAAGAAGTACGCTATTAACTATTAACGTACCT - 3600 
-TSKNTTKARKRS TLLT INVP 
-LA RIPRKQEKEVRY*LLTYL 
*QEYHESKKKKYAINY* RTC 
3601 - GTTTCTTCCGAAACGAATGAGTACATAAGTTCGTACTCACTTTCTTGTGCTTACAAAGGC - 3660
-VSSETNEY I SS YSLSCAYKG 
~FLPKRMST*VRTHFLVLTKA 
FFRNE*VHKFVLTFLCLQRH 
3661 - ACGCTAGTAGTCGTCGTCGGCTCATCATAAATTGGATCCATTGCTGGATTAGCAACTCCT - 3720
-TLVVVVGSS*IGSIAGLATP 
-R**SSSAHHKLDPLLD*QLL 
AS SRRRLIINWIHCWISNS* 
3721 - GAAGAGCCGTCGATTGTGTGTATTTGCACATTCGGTGGGTCTTTAACAAGCTTGTTAAAG - 3780 
-EE PS I VC I CTFGGSLTSLLK 
-KS RRLCVFAHSVGL*QAC * R 
RAVDCVYLHIRWVFNKLVKD 
3781 - ATGAAGAATGTAGCATTTTCAATAGCAGTGTCTGTAGTAATTTGTGTAGACTCAAGCTGG - 3840 
-MKNVAFS I PVSVVICVDSSW 

- * R M * HFQYQCL* * F V * TQAG 

EECSIFNTSVCSNLCRLKLV 
3841 - TAGTAAACTTCGGTGAAATAGCCATGTACAACGACATAGTCTTTAACACCTGAGTGCCTA - 3900 
-**TSVK*PCTTT*SLTPECL 
-SKLR*NSHV.QRHSL*HLSAY 
VN FGE I AM YN DI VFNT * VP I 
3901 - TCCTCAGAATAACCACCAATTTGGTAGTCTTCTTTGAGTTTTGGTGTTGAAATGCCGTCA - 3960
-SSE*PPIW*SSLSF GVEMPS 

- PQ'NNHQFGS LL * VLVLKCRH 

IiR. I TTNLVVFFEFWC* NAVT 

3961 - CCTTCAGTAACGACAATTGTATCTGTGACACTGTTATATGGTATACAGTAGTCATAGTTA - 4020

-PSVTTIVSVTLLYGIQ* S * L 

- LQ*RQLYL* HCYMVYSSHSY 

FSNDNCICDTVIWYTVVIVM 
4021 - TGTGTGTGCCAGCAAACAAAGTAGTTGGCATCATAAAGTAATGGGTTCTTGGATTTGCAC - 4080 
-CVCQQTK*LAS*SNGFLDLH 
-VCASKQSSWHHKVMGSWICT 

CVPANKVVGIIK*WVLGFAL 

4081 - TTCCAACAAAGCCAACATCTCATAATAATTCTACATGCGTTGATGCATTGTAGAAAATAT - 4140

-FQQSQHLI I I L H A L MHCRKY 
-SNKANIS* * FYMR*CIVENI 
PTKPTSHNNSTCVDAL* KIY 
4141 - ATCAAGGCATAGAGGTACAAAAATTGCGCCTCCTTACCTGCAGCGACAAGCAAAAGATGT - 4200
-IKA^RYKNCASLPAATSKRC 

- SRHRGTKIAPPYLQRQAKDV 

QG I E VQKLRL LT C S DKQ KM * 
4201 - GAATAGATGGTAACAAATAGCAGCAGTAAATTGCAAATGAACTGGAAGCCCTTATAAAGG - 4260
-E*MVTNSSSKLQMNWKPL*R 
-NRW*QIAAVNCK* TGSPYKG 
I DGNK*QQ* IANELEALIKG 
4261 - GCTAGCTGCCATCTTTTATTGAGCGCAATTATTTTGGTAGCGCTCTGAAAAACAGCAAGA - 4320 
-ASCHLLIiSAIILVAIi*KTAR 
-LAAIFY*AQLFW*RSEKQQE 
*LPSFIERNYFGSALKNSKK 
4321 - AATGCAACGCCAATAACAAGCCATCCGAAAGGGAGTGAGGCTTGTAGCGGTATCGTTGCT - 4380 
-NATPITSHPKGSEACSGIVA 
-MQRQ*QAIRKGVRLVAVSLL 
CNANNKPSERE* G L * RYRCC 
4381 - GTAGCATGAACAGTACTTGCAGGAGAAGCATTGTCAATTTTTACTGGCTGTGCAGTAATT - 4440 
-VA*TVIiAGEALS I FTGCAVI 

- * HEQYI>QEKHCQFLLAVQ*L 

SMNSTCRRSIVNFYWLCSN* 
4441 - GATCCAAGAGTAAAAAATCTCATAAACAAATCCATAAGTTCGTTTATGTGTAATGTAATT - 4500
-DPRVKNLINKSISSFMCNVI 
-IQE*KIS*TNP*VRLCVM*F 
SKSKKSHKQIHKFVYV*CNL 
4501 - TGACACCCTTGAGAACTGGCTCAGAGTCATCCTCATCAAACTTGCAGCAAGAACCACAAG - 4560

- * H P * ELAQ S H P H QT CS KNH K 
-DTLENWLRVILIKLAARTTR 

TPLRTGSESSSSNLQQEPQE 
4561 - AGCATGCACCCTTGAGGCAACTGCAACAACTAGTCATGCAACAAAGCAAGATTGTAACCA - 4620
-SMHP * GNCNN * SCNKARL* P 
-ACTLEATATTSHATKQDCNH 
HAPLRQLQQLVM-QQSKIVTM 
4621 - TGACGATGGCAATTAGTCCAGCAATGAAGCCGAGCCAAACATACCAAGGCCATTTAATAT - 4680

- * RWQLVQQ* SRAKHTKAI * Y 
-DDGN*SSNEAEPNIPRPFNI 

T M . A I S PAMKP S Q T YQGHL I Y 
4681 - ATTGCTCATATTTTCCCAATTCTTGAAGGTCAATGAGTGATTCATTTAAATTTTTAGCGA - 4740

- I A H I FPIIiEGQ*VIHLNF* R 
-LLIFSQFLKVNE* F I * IFSD 

CSYFPNS * R S M S DSFKFLAT 
4741 - CCTCATTGAGGCGGTCAATTTCTTTTTGAATGTTGACGACAGAAGCGTTAATGCCTGAAA - 4800 

- P H * GGQFLFEC*RQKR*CLK 
-LIEAVNFFLNVDDRSVNA*N 

SLRRS ISF^MLTTEALMPEM 
4801 - TGTCGCGAAGATCAACATCTGGTGATGTATGATTTTTGAAGTACTTGTCCAGCTCTTCTT - 4860 
-CRQDQHLVMY D F * STCPALL 
-VA KINIW*CMI FEVLVQLFF 
SPRSTSGDV*FLKYLSSSSL 
4861 - TGAATGAGTCAAGCTCAGGTTGCAGAGGATCATAAACTGTGTTGTTAATGATGCCAATAA - 4920

- * MSQAQVAE DHKLCC* * C Q * 
-E*VKLRLQRI I NCVVNDANN 

NESSSGCRGS*TVLLMMPIT 
4921 - CGACATCACAATTTCCTGAGACAAATGTATTGTCTGTAGTAATTATTTGTGGAGAAAAGA - 4980
-RHHNFLRQMYCL* * L F V E K R 
-DITIS*DKCIVCSNYLWRKE 
TSQFPETNVLSVVIICGEKK 
4981 - AGTTCCTCTGTGTAATAAACCAAGAAGTGCCATTAAACACAAAAAGACCTTCACGAGGGA - 5040

- S S S V * * TKKCH* TQKHLHEG 
-VPLCNKPRSAIKHKNTFTRE 

FLCVINQEVPLNTKTPSRGK 
5041 - AGTATGCTTTGCCTTCATGACAAATTGCTGGCGCTGTGGTGAAGTTCCTCTCCTGGGATG - 5100
-SMLCLHDKLLALW* SS SPGM 
-VCFAFMTNCWRCGEVPLLGW 
YALPS*QIAGAVVKFLSWDG 

5101 - GCACATACGTGACATGTAGGAAGACAACACCATGCGGGGCTGCTTGTGGGAAGGACATAA - 5160

- A H T * HVGRQHHAGLLVGRT* 
-HIRDM*EDNTMRGCLWEGHK 

TYVTCRKTTPCGAACGKDIR 
5161 - GGTGGTAGCCCTTTCCACAAAAGTCAACTCTTTTTGATTGTCCAAGAACACACTCAGACA - 5220 
-GGSP FHKSQLFL IVQEHTQT 
-VVALSTKVNS F * LSKNTLRH 
W * PFPQKSTLFDCPRTHSDI 
5221 - TTTTAGTAGCAGCAAGATTAGCAGAAGCCCTGATTTCAGCAGCCCTGATTAGTTGTTGTG - 5280

- F * *QQD*QKP*FQQP*LVVV 
~FSSSKISRSPDFSSPD*LLC 

LVAARLAEALI SAALI SCCV 
5281 - TTACATAGGTTTGAAGGCTTTGAAGTCTGCCTGTAATTAACCTGTCAATTTGTACCTCCG - 5340 
*- L H RFEGFEVCL * LTCQFVPP 
-Y IGLKALKSACN* PVNLYLR 
T*V*RL*SLPVINLSICTSA 
5341 - CCTCGACTTTATCAAGTCGCGAAAGGATATCATTTAGCACACTTGAAATTGCACCAAAAT - 5400 
-PRLYQVAKGYHLAHLKLHQN 
-LDFIKSRKDII *HT*NCTKI 
STLSSRERIS FS TLEIAPKL 
5401 - TAGAGCTAAGTTGTTTAACAAGTGTGTTTAATGCTTGAGCATTCTGGTTAACAACGTCTT - 5460 
-*S*VV*QVCLMLEHSG*QRL 
-RAKLFNKCV*CLSILVNNVL 
ELSCLTSVFNA*AFWLTTSC 
5461 - GCAGCTTGCCCAATGCAGTTGATGTTGTTGTAAGTGATTCTTGAATTTGACTAATCGCCT - 5520
-AACPMQLMLL*VILEFD*SP 

- Q L A Q C S * C C C K * FLNLTNRL 

SLPNAVDVVVSDS*I*I*IAL 
5521 - TGTTAAATTGGTTGGCGATTTGTTTTTGGTTCTCATAGAGAACATTTTGGGTAACTCCAA - 5580 

- C * IGWRFVFGSHREHFG*LQ 
-VKlrVGDLFLVLIENILGNSN 

LNWLAICFWFS*RTFWVTPM 
5581 - TGCCATTGAACCTATATGCCATTTGCATAGCAAAAGGTATTTGAAGAGCAGCGCCAGCAC - 5640 
-CH * TYMFFA* QKVF EEQRQH 

- A I E P ICHLHSKRYLKSSAST 

PLNLYAICIAKGI * R A A P A P 
5641 - CAAATGTGCATCCAGCAGTGGCAGTACCACTAACTAGAGCAGCAGTGTAGGCAGCAATCA - 5700 
-QMS IQQWQYH*LEQQCRQQS 
-KCPSSSGSTTN*SSSVGSNH 
NV H PAVAV PL TRAAV * AAI I 
5701 - TATCATCAGTGAGCAGAGGTGGCAACACTGTAAGTCCATTGAACTTCTGCGCACAAATGA - 5760
-YHQ*AEVATL*VH*TSAHK* 
-IISEQRWQHCKSIELLRTNE 
S S V S RGGN-TV S P L N F CAQMR 

5761 - GATCTCTAGCATTAATATCACCTAGGCATTCGCCATATTGCTTCATGAAGCCAGCATCAG - 5820

-DL*H*YHLGIRH IAS* SQHQ 
-ISSINIT*AFAILLHEASIS 
SLALIS PRHS PYCFMKPASA 

5821 - CGAGTGTCACCTTATTAAAGAGCAAGTCCTCAATAAAAGACCTCTTAGTTGGGTTTAGAG - 5880

-RVSPY* RASPQ*KTS*LALE 
-ECHLIKEQVLNKRPLSWL*R 
S V'TLLKSKSS I KDLLVGFRG 
5881 - GGTCAGGTAATATTTGTGAAAAATTAAAACCAGCAAAATATTTCAAAGTTGGGGTTTTGT - 5940

- G Q V I F V K N * NHQNI SKLGFC 
-VR.*YL*KIKTTKIFQSWGFV 

SGNICEKLKPPKYFKVGVLY 
5941 - ACATTTGTTTGACTTGAGCGAACACTTCACGTGTGTTGCGATCCTGTTCAGCAGCAATAC - 6000 

- T F V * LERTLHVCCDPVQQQY 
-HLFDLSEHFTCVAILFSSNT 

ICLT*ANTSRVI»RSCSAAIP 
6001 - CTGAGAGTGCACGATTTAGTTGTGTGCAAAAGCTACCATATTGGAGAAGCAAATTAGCAC - 6060 

- L R V H DLVVCKSYHIGEAN*H 
-*ECTI*LCAKATILEKQIST 

ESARFSCVQKLPYWRSKLAH 
6061 - ATTCAGTAGAATCTCCGCAGATGTACATATTACAATCTACGGAGGTTTTAGCCATAGAAA - 6120 
-IQ*NLRRCTYYNLRRF*P*K 
-FSRI SADVHITIYGGFSHRN 
SVES PQMYILQSTEVLAIET 
6121 - CAGGCATTACTTCTGTAGTAATGCTAATTGAAAAGTTAGTAGGTATAGCAATGGTGTTAT - 6180 
-QAIiIiL**C*LKS**V*QWCY 
-RHYFCSNA*N*KV SRYSNGVI 
G I . T S VVMLI EKLVG I A M V L L 
6181 - TAGAGTAAGCAATTGAACTATCAGCAGCTAAAGACATAGTATAAGCCACAATAGATTTTT - 6240 
-*SKQLNYQHLKT * Y K P Q * I F 

- RV SN * T I S T * R H S I SHNRFL 

E*AIELSAPKDIV*ATIDFW 
6241 - GGCTAGTACTACGTAATAAAGAAACTGTATGGTAACTAGCACAAATGCCAGCTCCAATAG - 6300 

- G * YYVIKKLYGN * HKCQLQ* 
-ASTT**RNCMVTSTNASSNR 

LVLRNKETVW*LAQMPAPIG 
6301 - GAATGTCGCACTCATAAGAAGTGTCGACATGCTCAGCTCCTATAAGACAGCCTGCTTGAG - 6360 
-ECRTHKKCRHAQLL* DSLLE 

- N V "A L IRSVDMLSSYKTACLS 

MS HS*EVSTCSAPIRQPA*V 
6361 - TCTGGAATACATTGTTTCCAGTAGAATATATGCGCCAAGCTGGTGTGAGTTGATCTGCAT - 6420 
-SGI HCFQ * N I C A K L. V * VDLH 
-LEYIVSSRIYAPSWCELICM 
WNTLFPVEYMRQAGVS* S A * 
6421 - GAATTGGTGTAGAAACATCAGTGCAGTTAACATCTTGATATAGAACAGCAACTTCAGATG - 6480
-ELL* KHQCS*HLDIEQQLQM 
~NCCRNISAVNILI*NSNFR* 
IAVETSVQLTS* YRTATSDE 
6481 - AAGCATTTGTTCCAGGTGTAATTACACTTACACCCCCAAAAGAGCAAGGTGAAATGTCTA - 6540
-KHLFQV* LHLHPQKSKVKCL 

- S I 'CSRCNYTYTPKRAR * N V * 

AFVPGVITLTPPKEQGEMSN 
6541 - ATATTTCAGATGTTTTAGGATCTCGAACGGAATCAGTGAAATCAGAAACATCACGGCCAA - 6600 
-I FQMF* DLERNQ*NQKHHGQ 
-YFRCFRISNGISEIRNITAK 
ISDVLGSRTESVKSETSRPN 
6601 - ATTGTTGAAATGGTTGAAATCTCTTTGAAGAAGGAGTTAACACACCAGTACCAGTGAGTC - 6660 
-I VEMVEX SLKKELTHQYQ*V 

- LL KWLKS L * RRS * HTS TSES 

C*NG*NLFEEGVNTPVPVSP 
6661 - CATTAAAATTAAAATTGACACACTGGTTCTTAATAAGGTCAGTGGATAATTTTGGTCCAC - 6720 

- fl * N * N * H T G S * * GQWI ILVH 
-IKIKIDTLVLNKVSG* FWST 

LKLKLTHWFLIRSVDNFGPQ 
6721 - AAACCGTGGCCGGTGCATTTAAAAGTTCAAAAGAAAGTACTACAACTCTGTAAGGTTGGT - 6780
-KPWPVHLKVQKKVLQLCKVG 
-NRGRCI *KFKRKYYNSVRLV 
TVAGAFKSSKESTTTL*GW* 

6781 - AGCCAATGCCAGTAGTGGTGTAAAAACCATAATCATTTAATGGCGAATAACAATTAAGAG - 6840
-SQCQ * WCKNHNHLMANNN * E 

- ANASSGVKTIII*WPITIKS 

PMPVVV*KP*SFNGQ*QLRA 
6841 - CAGGTGGGGTGCAAGGTTTGCCATCAGGGGAGAAAGGCACATTAGATATGTCTCTCTCAA - 6900 
-QVGCKVCHQGRKAH * I CLSQ 
-RWGARFAJRGERH I R Y V $ L K 
GGVQGLPSGEKGTLDMSLSK 
6901 - AGGGCCTAAGCTTGCCATGTCTAAGATACCTATATTTATAATTATAATTACCAGTTGAAG - 6960

- R A * A C H V * DTYI YNYNYQLK 
-GP.KLAMSKIPIFIIIITS*S 

GLSLPCLRYLYL*L*LPVEV 
6961 - TAGCATCAATGTTCCTAGTATTCCAAGCAAGGACACAACCCATGAAATCATCTGGCAATT - 7020 
-*HQCS*YSKQGHNP*NHLAI 
-SINVPSIPSKDTTHEIIWQF 
ASMFLVFQARTQPMKSSGNL 
7021 - TATAATTATAATCAGCAATAACACCAGTTTGTCCTGGCGCTATTTGTCTTACATCATCTC - 7080
-YNYNQQ* HQFVLALFVLHHL 

- IIIXSNNTSLSWRYLSYIIS 

*L*SAITPVCPGAlCkTSSP 
7081 - CCTTGACTACAAAAGAATCTGCATAGACATTGGAGAAGCAAAGATCATTCAACTTAGTGG - 7140 

- p * LQKNLHRHWRSKDH S T * W 

- LDYKRICIDIGEAKI IQLSG 

LTTKESA*TLEKQRSFNLVA 
7141 - CAGAAACGCCATAGCACTTAAAGGTTGAAAAAAATGTTGAGTTGTAGAGCACAGAGTAAT - 7200
-QKRH S T* RLKKMLS) CRAQ SN 
-RNAIALKG* KKC*VVEHRVI 
ETP*HLKVEKNVEL*STE*S 
7201 - CAGCAAGACAATTAGAAATTTTTTTTCTCTCCCATGCATAGACAGAAGGGAATTTAGTAG - 7260
-QQHN*KFFFSPMHRQKGI** 

- SNT IRNFFSLPCIDRREFSS 

ATQLEIFFLSHA*TEGNLVA 
7261 - CATTAAAAACCTCTCCAAAAGGACACAAGTTTGTAATATTAGGGAATGTCACAAGATCTC - 7320 
-H*KPLQKDTSL*Y*GISQHL 
-IKNLSKRTQVCNIRES H TSJ X S 
LKTSPKGHKFVI LGNLTTSP 
7321 - CTGAGGGAACAACCCTGAAATTAGAGGTCTGGTAAATTCCTTTGTCAATCTCAAAGCTCT - 7380 
-LREQP * N *RSGKFLCQSQSS 

- *GNNPEIR6LVNSFVNLKAL 

EGTTLKLEVW*IPLSISKLL 
7381 - TAACAGAGCATTTGAGTTCAGCAAGTGGATTTTGAGAACAATCAACAGCATCTGTGATTG - 7440
-*QSI*VQQVDFENNQQHIi*L 
-NRAFEFSKWILRTINSICDC 
TSHLSSASGF*EQSTASVIV 
7441 - TACCATTTTCATCATACTTGAGCATAAATGTAGTTGGCTTTAAATAGCCAACAAAATAGG - 7500 
-YHFRHT* A*M*LAL1SISQQNR 
-TIFI ILEHKCSWIi* IANKIG 
PFSSYLSINVVGFK*PTK*A 
7501 - CTGCAGCTGACGTGCCCCAAATGTCTTGAGCAGGTGAAAAGGCTGTAAGAATGGCTCTAA - 7560 
-LQLTCPKCLEQVKRIi*BWL* 
-CS*RAPNV1>SR*KGCKNGSK 
AADV PQMS *AGEKAVRMALK 
7561 - AATTTGTAATGTTAATACCAAGAGGCAACTTAAAAATAGGTTTCAAAGTGTTAAAACCAG - 7620
-NL*C*YQEAT*K*VSKC*NQ 
-ICNVNTKRQLKNRFQSVKTR 
FVMLIPRGNLKIGFKVLKPE 
7621 - AAGGTAGATCACGAACTACATCTATAGGTTGATAGCCCTTATAAACATAGAGAAACCCAT - 7680 
-KVDHELHL * VDS PYKHRETH 
- R * ITNYIYRLIALINIEKPI 
GRSRTTSIG**PL*T*RNPS 
7681 - CTTTATTTTTAAACACAAACTCTCGTAAGTGTTTAAAATTACCTGACTTTTCTGAAACAT - 7740 
-LYF*TQTLVSV * NYLTFLKH 
-FIFKHKLS*VFKIT*LF*NI 
LFLNTNSRKCLKLPDFSETS 
7741 - CAAGCGAAAAGGCATCAGATATGTACTCGAAAGTGCAATTAAATGCATTATCGAATATCA - 7800 
-QAKRHQICTRKCN*MHYRIS 
-KRKGIRYVLESAIKCI IEYH 
SEKAS DMYSKVQLNALSNI I 
7801 - TAGTATGTGTCTGTGTACCCATGGGTTTAGAAACAGCAAAGAAAGGGTTGTCACACAATT - 7860
~*YVSVYPWV*KQQRKGCHTI 
-SMCLCTHGFRNSKERVVTQF 
VCVCVPMGLETAKKGLSHNS 
7861 - CAAAGTTACATGCTCGTATAACAACATTAGTAGAATTGTTAATAATAATCACCGACTGTG - 7920
-QSYMLV*QH**NC***SPTV 
-KVTCSYNNISRIVNNNHRL* 
KLHARIT TLVELLII ITDCD 

7921 - ACTTGTTGTTCATGGTAGAACCAAAAACCCAACCACGGACAACATTTGATTTCTCTGTGG - 7980

-TCCSW*NQKPNHGQHL I SLW 
-LVVHGRTKNPTTDNI* FLCG 
LLFMVEPKTQPRTTFDFSVA 
7981 - CAGCAAAATAAATACCATCCTTAAAAGGTATGACAGGGTTGCCAAACGTATGATTAATAG - 8040
-QQNKYHP * K V * QGCQTYD* * 
-SK1NTILKRYDRVAKRMINS 
AK*IPSLKGMTGLFNV*LIV 

8041 - TATGAAACCCTGTAACATTAGAATAAAATGGAAGAAATAAATCCTGAGTTAAATAAAGAG - 8100

-YETL* H * N K M E E IN PELNKE 
-MKPCNIRIKWKK*ILS*IKS 
*NPVTLE*1S!GRNKS*VK*RV 
8101 - TGTCTGATCTAAAAATTTCATCAGGATAGTAAACCCCCCTCATAGATGAAGTATGTTGAG - 8160 
-CLI*KFHQDSKPPS*MKYVE 
-V*SKNFIRIVNPPHR*SMLS 
SDLKISSG**TPLIDEVC*V 
8161 - TGTAATTAGGAGCTTGAACATCATCAAAAGTGGTGCACCGGTCAAGGTCACTACCACTAG - 8220 

- C N * ELEHHQKWCTGQGHYH * 
-VIRSLNIIKSGAPVKVTTTS 

*LGA*TSSKVVHRSRSLPLV 
8221 - TGAGAGTAAGAAATAATAAGAAAATAAACATGTTCGTTTAGTTGTTAACAAGAATATCAC - 8280

- * E * E I IRK*TCSFSC*QEYH 

- E S K K * * ENKHVRLVVNKN IT 

RVRNNKKINMFV*LLTRISL 
8281 - TTGAAACCACAACTCTGTTGTTTTCTCTAATGATAAGCCTACCTTTTTCCAGAAGAGAAT - 8340
-LKPQLCCFL* * *AYLFPEEN 
*NHNSVVFSNDKPTFFQKRI 
ETTTLLFSLMISLPFSRRE* 
8341 - AAATCATATCATTGATTTGATTCTCCTTAAGAGACATTACAGCAGTTCCTCTTAATTTAA - 8400
-KSYH*FDSP*ETLQQFLLI* 
-NHIIDLILLKRHYSSSS*FK 
I ISLI * FSLRDITAVPLNLR 
8401 - GAGGAAATTTGCTCATGTCAAAGAGTGAATAGGAAGACAACTGGATAGGATTTGTGTTCC - 8460
-EEICSCQRVNRKTTG* DLCS 
-RKFARVKE* IGRQLDRICVP 
GNLLMSKSE*EDNWIGFVFL 
8461 - TCCAGAAAATGTAGTTAGCATGCATGGTATAGCCATCAATTTGTTCCTTCGGCTTGCCAA - 8520
~SRKCS*HAWYSHQFVPSACQ 
-PENVVSMHGIAINLFLRLAK 
QKM*LACMV*PSICSFGLPR 
8521 - GATAGTTAGCCCCAATTAAAAATGCTTCCGATGATGATGCATTTACATTTGTAACAAAAG - 8580 
-DS*PQLKMLPMMMHLHL*QK 
-IVSPN*KCFR**CIYICNKS 
*LAPIKNASDDDAFTFVTKA
8581 - CTGTCCACCATGAGAAATGGCCCATAAGCTTGTAAAGGTCAGCATTCCAAGAATGCTCTG - 8640
-LSTMRNGP*ACKGQHSKNAL
-CPP*EMAHKLVKVSIPRMLC
VHHEKWPISL*RSAFQECSV
8641 - TTATCTTTACAGCTATAGAACCACCCAGGGCTAGTTTTTGCTTTATAAATCCACACAGAT - 8700
-LSLQL*NHPGLVFAL*IHTD
-YLYSYRTTQG*FLLYKSTQI
IFTAIEPPRASFCFINPHR* 
8701 - AAGTGAAAAACCCTTCTTTAGAGTCATTCTCTTTTGTCACATGTTTGGTCCTAGGGTCAT - 8760
-K*KTLL*SHSLLSHVWS*GH
-SEKPFFRVILFCHMFGPRVI 
VKNPSLESFSFVTCLVLGSY 
8761 - ACATATCGCTAATAATAAGGTCCCATTTATTAGCCGTATGTACTGTTGCACAGTCTCCAA - 8820
-TYR***GPIY*PYVLLHSLQ
-HIANNKVPFISRMYCCTVSN 

ISLIIRSHLLAVCTVAQSPI 
8821 - TTAAAGTAGAATCTGCGTCGGAGACGAAGTCATTAAGATCTGAATCGACAAGTAGTGTGC - 8880 

-LK*NLRRRRSH*DLNRQVVC
-*SRICVGDEVIKI*IDK*CA

KVESASETKSLRSESTSSVP 
8881 - CAGTTGGCAACCATTGTCTGAGCACAGCTGTACCTGGTGCAACTCCTTTATCAGAGCCAG - 8940
-QLATIV*AQLYLVQLLYQSQ
-SWQPLSEHSCTWCNSFIRAS 
VGNHCLSTAVPGATPLSEPA 
8941 - CACCAAAGTGAATAACTCTCATGTTGTAGGGTACAGCTAAAGTAAGTGTATTTAAGTATT - 9000 
-HQSE*LSCCRVQLK*VYLSI
-TKVNNSHVVGYS*SKCI*VL
PK*ITLML*GTAKVSVFKY*
9001 - GACACAGTTGAGTATACTTTGCGACATTCATCATTATTCCTTTTGGTATAACAGCATTTT - 9060 
-DTVEYTLRHSSLFLLV*QHF
-TQLSILCDIHHYSFWYNSIF
HS*VYFATFIIIPFGITAFS
9061 - CACCATAATTCTGAAGGTCACACTTTTCAAGAAGCATTCTTTGCATCTTGTACAAGTTAG - 9120 
-HHNSEGHTFQEAFFASCTS*
-TIILKVTLFKKHSLHLVQVR

P*F*RSHFSRSILCILYKLG 
9121 - GCATCGCAACACCTGGTTGCCACGCTTGACTTGCTTGTAGTTTTGGGTAGAAGGTTTCAA - 9180 
-ASQHLVATLDLLVVLGRRFQ
-HRNTWLPRLTCL*FWVEGFN
IATPGCHA*LACSFG*KVST 
9181 - CATGTCCATCCTTACACCAAAGCATGAATGAAATTTCAGCATAGTCAATTGTAACCTTGA - 9240
-HVHPYTKA*MKFQHSQL*P*
-MSILTPKHE*NFSIVNCNLD
CPSLHQSMNEISA*SIVTLT
9241 - CCACTTTTGAAATCACTGACAAATCTTGTGACTTTATTATCTCGACAAAGTCATCAAGTA - 9300
-PLLKSLTNLVTLLSRQSHQV
-HF*NH*QIL*LYYLDKVIK*
TFEITDKSCDFIISTKSSSK
9301 - AAAGATCAATCACAGAACACACACATTTTGATGAACCTGTTTGCGCATCTGTTATGAAGT - 9360
-KDQSQNTHILMNLFAHLL*S
-KINHRTHTF**TCLRICYEV
RSITEHTHFDEPVCASVMK*
9361 - AATTTTTCACTGTGCTGTCCATAGGGATAAAATCCTCTAATTTAAGTGGTGAATCTTGTG - 9420
-NFSLCCP*G*NPLI*VVNLV
-IFHCAVHRDKIL*FKW*IL*
FFTVLSIGIKSSNLSGESCE 
9421 - AGCGCTTGGCTAAGCCTATCATTAAATGAAGACCGCCAAGTTGTCCATGACTGAAATCTC - 9480
-SAWLSLSLNEDRQVVRD*NL 
-ALG*AYH*MKTAKLSMTEIS 
RLAKPIIK*RPPSCP*LKSP 
9481 - CATAAACGATGTGTTCGAAGGCATAGCCCTCGAGCTTATATCGCTGTATGAATTCATCCA - 9540 

-HKRCVRRHSPRAYIAV*IHP
-INDVFEGIALELISLYEFIH
*TMCSKA*PSSLYRCMNSSI
9541 - TAGCGAGCTCGAGAAAGTCAGTTTCCATTTGTGATCTGGGCTTAAAATCCTCTAAGTCTC - 9600
-*RARESQFPFVIWA*NPLSL
-SELEKVSFHL*SGLKIL*VS
ASSRKSVSICDLGLKSSKSL
9601 - TGCTCTGAGTAAAGTAGGTTTCAGGCAACTGTTGAATAATGCCGTCTACTTTCTTAAAGT - 9660
-CSE*SRFQATVE*CRLLS*S
-ALSKVGFRQLLNNAVYFLKV
L*VK*VSGNC*IMPSTFLK*
9661 - AGTTAAACTGTGTTTTTACTGATTCTCCAATTAATGTGACTCCATTGACGCTAGCTTGTG - 9720
-S*TVFLLILQLM*LH*R*LV
-VKLCFY*FSN*CDSIDASLC
LNCVFTDSPINVTPLTLACA
9721 - CTGGTCCCTTTGAAGGTGTTAGACCTTTGACTGAACCTTCTGTTATTAAAACACCATTAC - 9780
-LVPLKVLDL*LNLLLLKHHY
-WSL*RC*TFD*TFCY*NTIT
GPFEGVRPLTEPSVIKTPLR
9781 - GGGCGTTTCTAAAAAGGTCTACCTGTCCTTCCACTCTACCATCAAACAAGACAGTAAGTG - 9840
-GRF*KGLPVLPLYHQTRQ*V
-GVSKKVYLSFHSTIKQDSK*
AFLKRSTCPSTLPSNKTVSE
9841 - AAGAACAAGCACTCTCAGTAGGTTTCTTGGCAATGTCAGTCATTGTGCAGACACCTATTG - 9900
-KNKHSQ*VSWQCQSLCRHLL
-RTSTLSRFLGNVSHCADTYC
EQALSVGFLAMSVIVQTPIV
9901 - tagatacatgtgctggggcttctcttttgtagtcccagattacagtattagcagcgatat - 9960 

-*IHVLGLLFCSPRLQY*QRY
-RYMCWGFSFVVPDYSISSDI
DTCAGASLL*SQITVLAAIS
9961 - CAACACCCAAATTATTGAGTATCTTAATCTCTGGCACTGGTTTAATGTTACGCTTAGCCC - 10020 
-QHPNY*VS*SLALV*CYA*P
-NTQIIEYLNLWHWFNVTLSP 

TPKLLSILISGTGLMLRLAQ 
10021 - AAAGCTCAAATGCAACATTAACAGGAAGTGTTGTCTTATTTTCAAAGATCTCCACATCAA - 10080
-KAQMQH*QEVLSYFQRSPHQ
-KLKCNINRKCCLIFKDLHIN
SSNATLTGSVVLFSKISTSI
10081 - TACCATCTACCTTTGTGTAAACAGCATTATTAATGATGGAAACAGGTGCTTCGCCGGCGT - 10140
-YHLPLCKQHY**WKQVLRRR
-TIYLCVNSIINDGNRCFAGV
PSTFV*TALLMMETGASPAC

10141 - GTCCATCAAAGTGTCCTTTATTAACAACATTATAAGCCACATTTTCTAAACTCTGTAACC - 10200 
-VHQSVLY*QHYKPHFLNSVT 

-SIKVSFINNIISHIF*TL*P
PSKCPLLTTL*ATFSKLCNL
10201 - TGGTAAATGTATTCCACAGGTTATAAGTATCAAATTGTTTGTAAATCCATAGGCTAAATC - 10260
-W*MYSTGYKYQIVCKSIG*I
-GKCIPQVISIKLFVNP*AKS
VNVFHRL*VSNCL*IHRLNP
10261 - CAGCAGAAATCATCATATTATATGCATCCAAGTACTGTCGGTACTCATTTGCATGGTGTC - 10320 
-QQKSSYYMHPSTVGTHLHGV
-SRNHHIICIQVLSVLICMVS
AEIIILYASKYCRYSFAWCL
10321 - TGCAAACAGCACCACCTAAATTGCATCGTGTAATACACGTAGCAGATTTGAGTGGAACAT - 10380 
-CKQHHLNCIV*YT*QI*VEH
-ANSTT*IASCNTRSRFEWNI 
QTAPPKLHRVIHVADLSGT* 
10381 - AATCAATATCCGACACTACTTGTTTGCCATGAGACTCACAAGGACTATCAGAATAGTAAA - 10440 
-NQYPTLLVCHETHKDYQNSK
-INIRHYLFAMRLTRTIRIVK 
SISDTTCLP*DSQGLSE**K 
10441 - AGAAAGGCAATTGCTTTAAATTAGTAAATGCACTTTTATCGAAAGCTGGAGTGTGGAATG - 10500 
-RKAIALN**MHFYRKLECGM 
-ERQLL*ISKCTFIESWSVEC 
KGNCFKLVNALLSKAGVWNA 
10501 - CATGCTTATTCACATACAAACTACCACCATCACAGCCTGGTAAGTTCAAGTTTGACAAGA - 10560 
-HAYSHTNYHHHSLVSSSLTR
-MLIHIQTTTITAW*VQV*QD
CLFTYKLPPSQPGKFKFDKT 
10561 - CTCTTGTGTCAAACCTACACACAATTGCATTGGCTGGGTAACGATCAACGTTACAATTCC - 10620 
-LLCQTYTQLHWLGNDQRYNS 

-SCVKPTHNCIGWVTINVTIP

LVSNLHTIALAG*RSTLQFQ 
10621 - AAAACAAACAAACACCATCAGTGAATTTATCGTGATGTGTAGCATAAGAATAGAAGAGTT - 10680 
-KTNKHHQ*IYRDV*HKNRRV
-KQTNTISEFIVMCSIRIEEF 
NKQTPSVNLS*CVA*E*KSS 
10681 - CCTCTATTTTGTAAGCTTTGTCACTACATGGCTGAGCATCGTAGAACTTCCATTCTACTT - 10740 
-PLFCKLCHYMAEHRRTSILL
-LYFVSFVTTWLSIVELPFYF 
SIL*ALSLHG*AS*NFHSTS 
10741 - CAGCCTGAGGCACACACTTGATAGCCTTTGGATTTCCAATGTCATGAAGAACTGGAAACT - 10800 
-QPEAHT**PLDFQCHEELET
-SLRHTLDSLWISNVMKNWKL

A*GTHLIAFGFPMS*RTGNL 
10801 - TATCAGCAAGCAATGCAGACTTCACAACCATGTGTTGTACTTTTCTGCAAGCAGAATTAA - 10860 
-YQQAMQTSQPCVVLFCKQN*
-ISKQCRLHNHVLYFSASRIN

SASNADFTTMCCTFLQAELT 
10861 - CCCTCAGTTCATCTCCTATAATAGGGTATTCAACAGACCAATCAACGCGCTTAACAAAGC - 10920 
-PSVHLL**GIQQTNQRA*QS
-PQFISYNRVFNRPINALNKA
LSSSPIIGYSTDQSTRLTKH
10921 - ACTCATGGACTGCTAAACATCTAGTCATGATAGCATCACAACTAGCCACATGTGCATTTC - 10980
-THGLLNI*S**HHN*PHVHF
-LMDC*TSSHDSITTSHMCIS 
SWTAKHLVMIASQLATCAFP 
10981 - CATGTACCTGGCAATGTTGGTCATGGTTACTCTGAAGGTTACCCGTAAAGCCCCACTGCT - 11040
-HVPGNVGHGYSEGYP*SPTA
-MYLAMLVMVTLKVTRKAPLL 
CTWQCWSWLL*RLPVKPHC* 
11041 - GAACATCAATCATAAATGGGTTATAGACATAGTCAAAACCCACAGAATGATTCCAGCAGG - 11100
-EHQS*MGYRHSQNPQNDSSR
-NINHKWVIDIVKTHRMIPAG
TSIINGL*T*SKPTE*FQQA 
11101 - CATAAGTATCTGATGAAGTAGAAAAGCAAGTTGCACGTTTGTCACACAGACAACACGTTC - 11160 
-HKYLMK*KSKLHVCHTDNTF
-ISI**SRKASCTFVTQTTRS

*VSDEVEKQVARLSHRQHVL 
11161 - TTTCAGGTCCAATCTTGACAAAGTACTTCATTGATGTAAGCTCAAAGCCATGCGCCCAAA - 11220 

-FQVQS*QSTSLM*AQSHAPK
-FRSNLDKVLH*CKLKAMRPK
SGPILTKYFIDVSSKPCAQR
11221 - GGACGAACACGACTCTGTCTGACAATCCTTTCAGTGTATCACTGAGCATTTGTACTATCT - 11280 
-GRTRLCLTILSVYH*AFVLS
-DEHDSV*QSFQCITEHLYYL
TNTTLSDNPFSVSLSICTIL
11281 - TAATACGCACTACATTCCAGGGCAAGCCTTTATACATGAGTGGTATAAGATGTTTAAACT - 11340 
-*YALHSRASLYT*VV*DV*T 
-NTHYIPGQAFIHEWYKMFKL
IRTTFQGKPLYMSGIRCLNW 
11341 - GGTCACCTGGTGGAGGTTTTGCATTAACTCTGGTGAATTCTGTGTTATTTTCAGTGTCAA - 11400 
-GHLVEVLH*LW*ILCYFQCQ
-VTWWRFCINSGEFCVIFSVN 
SPGGGFALTLVNSVLFSVST 
11401 - CATAACCAGTCGGTACAGCTACTAAGTTAACACCTGTAGAAAATCCTAGCTGGAGAGGTA - 11460 
-HNQSVQLLS*HL*KILAGEV 
-ITSRYSY*VNTCRKS*LER* 

*PVGTATKLTPVENPSWRGR

11461 - GGTTAGTACCCACAGCATCTCTAGTTGCATGACAGCCCTCTACATCAAAGCCAATCCACG - 11520 
-G*YPQHL*LHDSPLHQSQST 
-VSTHSISSCMTALYIKANPR 
LVPTASLVA*QPSTSKPIHA 

11521 - CACGAACGTGACGAATAGCTTCTTCGCGGGTGATAAACATATTAGGGTAACCATTGACTT - 11580 
-HERDE*LLRG**TY*GNH*L
-TNVTNSFFAGDKHIRVTIDL 
RT*RIASSRVINILG*PLTW 

11581 - GGTAATTCATTTTGAAACCCATCATAGAGATGAGTCTACGGTAGGTCATGTCCTTTGGTA - 11640 
-GNSF*NPS*R*VYGRSCPLV
-VIHFETHHRDESTVGHVLWY
*FILKPIIEMSLR*VMSFGM

11641 - TGCCTGGTATGTCAACACATAATCCTTCAGTCTTGAATTTTATATCAACGCTGAGGTGTG - 11700 
-CLVCQHIILQS*ILYQR*GV
-AWYVNT*SFSLEFYINAEVC
PGMSTHNPSVLNFISTLRCV 

11701 - TAGGTGCCTGTGTAGGATGAAGACCAGTAATGATCTTACTACAGTCCTTAAAAAGTCCAG - 11760
-*VPV*DEDQ**SYYSP*KVQ
-RCLCRMKTSNDLTTVLKKSS
GACVG*RPVMILLQSLKSPV
11761 - TTACATTTTCTGCTTGTAATGTAGCCACATTGCGACGTGGTATTTCTAGACTTGTAAATT - 11820
-LHFLLVM*PHCDVVFLDL*I
-YIFCL*CSHIATWYF*TCKL
TFSACNVATLRRGISRLVNC
11821 - GCAGTTTGTCATAAAGATCTCTATCAGACATTATGCACAAAATGCCAATTTTTGCCCTTG - 11880
-AVCHKDLYQTLCTKCQFLPL
-QFVIKISIRHYAQNANFCPC
SLS*RSLSDIMHKMPIFALV
11881 - TGATAGCCACATTGAAGCGGTTGACATTACAAGAGTGTGCTGTTTCAGTAGTTTGTGTGA - 11940 
-**PH*SG*HYKSVLFQ*FV* 
-DSHIEAVDITRVCCFSSLCE 
IATLKRLTLQECAVSVVCVN
11941 - ATATGACATAGTCATATTCAGAACCCTGTGATGAATCAACAGTCTGCGTAGGCAATCCTA - 12000 
-I*HSHIQNPVMNQQSA*AIL
-YDIVIFRTL**INSLRRQS*
MT*SYSEPCDESTVCVGNPK
12001 - AGATTTTTGAAGCTACAGCGTTCTGTGAATTATAAGGTGAGATAAAAACAGCTTTTCTCC - 12060 
-RFLKLQRSVNYKVR*KQLFS
-DF*SYSVL*IIR*DKNSFSP 
IFEATAFCEL*GEIKTAFLQ 
12061 - AAGCAGGATTGCGTGTAAGAAATTCTCTTACAACGCCTATTTGAGGTCTGTTGATTGCAG - 12120 
-KQDCV*EILLQRLFEVC*LQ
-SRIACKKFSYNAYLRSVDCR
AGLRVRNSLTTPI*GLLIAD
12121 - ATGAAACATCATGTGTAATAACACCTTTGTAGAACATTTTGAAGCATTGAGCTGACTTAT - 12180
-MKHHV**HLCRTF*SIELTY
-*NIMCNNTFVEHFEALS*LI
ETSCVITPL*NILKH*ADLS 
12181 - CCTTGTGTGCTTTTAGCTTATTGTCATAAACTAAAGCACTCACAGTGTCAACAATTTCAG - 12240
-PCVLLAYCHKLKHSQCQQFQ 
-LVCF*LIVIN*STHSVNNFS 
LCAFSLLS*TKALTVSTISA 
12241 - CAGGACAACGGCGACAAGTTCCAAGGAACATGTCTGGACCTATTGTTTTCATAAGTCTGC - 12300 
-QDNGDKFQGTCLDLLFS*VC
-RTTATSSKEHVWTYCFHKSA 
GQRRQVPRNMSGPIVFISLH 
12301 - ACACTGAATTAAAATATTCTGGTTCTAGTGTGCCTTTAGTCAGCAATGTGCGGGGGGCTG - 12360 

-TLN*NILVLVCL*SAMCGGL
-H*IKIFWF*CAFSQQCAGGW 

TELKYSGSSVPLVSNVRGAG 
12361 - GTAATTGAGCAGGATCGCCAATATAGACGTAGTGTTTTGCACGAAGTCTAGCATTGACAA - 12420 
-VIEQDRQYRRSVLHEV*H*Q
-*LSRIANIDVVFCTKSSIDN 
N*AGSPI*T*CFARSLALTT 
12421 - CACTCAAGTCATAATTAGTAGCCATAGAGATTTCATCAAAGACTACAATGTCAGCAGTTG - 12480 
-HSSHN**P*RFHQRLQCQQL
-TQVIISSHRDFIKDYNVSSC

LKS*LVAIEISSKTTMSAVV 
12481 - TTTCTGGCAATGCATTTACAGTGCAGAAAACATACTGTTCTAGTGTTGAATTCACTTTGA - 12540
-FLAMHLQCRKHTVLVLNSL*
-FWQCIYSAENILF*C*IHFE
SGNAFTVQKTYCSSVEFTLN 
12541 - ATTTATCAAAACACTCTACGCGCGCACGCGCAGGTATGATTCTACTACATTTATCTATGG - 12600 
-IYQNTLRAHAQV*FYYIYLW
-FIKTLYARTRRYDSTTFIYG
LSKHSTRARAGMILLHLSMG
12601 - GCAAATATTTTAATGCCTTTTCACATAGGGCATCAACAGCTGCATGAGAGCATGCCGTAT - 12660
-ANILMPFHIGHQQLHESMPY
-QIF*CLFT*GINSCMRACRI
KYFNAFSHRASTAA*EHAVY 
12661 - ACACTATGCGAGCAGATGGGTAATAGAGAGCAAGTCCGACGGCAAAATGACTCTTACCAG - 12720 
-TLCEQMGNREQVRWQNDSYQ 
-HYASRWVIESKSDGKMTLTS
TMRADG**RASPMAK*LLPV 
12721 - TACCAGGTGGTCCTTGGAGTGTAGAGTACTTTTGCATGCCGACCTTTTGATAATTTGCAA - 12780 
-YQVVLGV*STFACRPFDNLQ
-TRWSLECRVLLHADLLIICN
PGGPWSVEYFCMPTF**FAT
12781 - CATTGCTAGAAAACTCATCTGAGATGTTGAGTGTTGGGTACAAGCCAGTAATTCTCACAT - 12840 
-HC*KTHLRC*VLGTSQ*FSH
-IARKLI*DVECWVQASNSHI
LLENSSEMLSVGYKPVILT*
12841 - AGTGCTCTTGTGGCACTAGAGTAGGTGCACTAAGTGGCATTACAGTGTGAGATGTCAACA - 12900 
-SALVALE*VH*VALQCEMST
-VLLWH*SRCTKWHYSVRCQH
CSCGTRVGALSGITV*DVNT
12901 - CAAAGTAATCACCAACATTCAACTTGTATGTCGTAGTACCTCTGTACACAACAGCATCAC - 12960
-QSNHQHSTCMS*YLCTQQHH 
-KVITNIQLVCRSTSVHNSIT 
K*SPTFNLYVVVPLYTTASP 
12961 - CATAGTCACCTTTTTCAAAGGTGTACTGTCCAATCTGTACTTTACTATTTTTAGTTACAC - 13020
-HSHLFQRCTLQSVLYYF*LH
-IVTFFKGVLSNLYFTIFSYT
*SPFSKVYSPICTLLFLVTR 
13021 - GGTAACCAGTAAAGACATAGTTTCTGTTCAATGGTGGTCTAGGTTTTCCAACCTCCCATG - 13080 
-GNQ*RHSFCSMVV*VFQPPM 
-VTSKDIVSVQWWSRFSNLP* 
*PVKT*FLFNGGLGFPTSHE
13081 - AAAGATGCAATTCTCTGTCAGAGAGTACTTCGCGTACAGTGGCAATACCATATGACAGCT - 13140
-KDAILCQRVLRVQWQYHMTA
-KMQFSVREYFAYSGNTI*QL
RCNSLSESTSRTVAIPYDSL 
13141 - TAAATGTTTCCTCAGTGGCTTTGAGCGTTTCTGCTGCGAAAAGCTTGAGTCTCTCAGTAC - 13200
-*MFPQWL*AFLLRKA*VSQY 
-KCFLSGFERFCCEKLESLST 
NVSSVALSVSAAKSLSLSVQ 
13201 - AAGTGTTGGCAAGTATGTAATCGCCAGCATTAGTCCAATCACATGTTGCTATCGCATTGA - 13260 
-KCWQVCNRQH*SNHMLLSH*
-SVGKYVIASISPITCCYRIE
VLASM*SPALVQSHVAIALK
13261 - AGTCAGTGACATTGTCACTGCCTACACATGTGTTTTTGTATAAACCAAAAACCTGACCAT - 13320 

-SQ*HCHCLHMCFCINQKPDH
-VSDIVTAYTCVFV*TKNLTI
SVTLSLPTHVFLYKPKT*PL
13321 - TAGCACATAATGGAAAACTAATGGGAGGCTTATGTGACTTGCAATAATAGCTCATACCTC - 13380 
-*HIMEN*WEAYVTCNNSSYL 
-ST*WKTNGRLM*LAIIAHTS 
AHNGKLMGGLCDLQ**LIPP
13381 - CTAGATACAGTTGTGTCACATCAGTGACATCACAACCTGGGGCATTGCAAACATAGGGAT - 13440 
-LDTVVSHQ*HHNLGHCKHRD
-*IQLCHISDITTWGIANIGI
RYSCVTSVTSQPGALQT*GL
13441 - TAACAGACAACACTAATTTGTGTGATGTTGAAATGACATGGTCATAGCAGCACTTGCAAC - 13500
-*QTTLICVMLK*HGHSSTCN
-NRQH*FV*C*NDMVIAALAT
TDNTNLCDVEMTWS*QHLQH
13501 - ATAGGAATGGTCTCCTAATACAGGCACCGCAACGAAGTGAAGTCTGTGAATTGCACAATA - 13560 
-IGMVS*YRHRNEVKSVNCTI
-*EWSPNTGTATK*SL*IAQY 
RNGLLIQAPQRSEVCELHNT 
13561 - CACAAGCACCTACAGCCTGCAAGACTGTATGTGGTGTGTACATAGCCTCATAAAACTCAG - 13620 
-HKHLQPARLYVVCT*PHKTQ
-TSTYSLQDCMWCVHSLIKLR
QAPTACKTVCGVYIAS*NSG
13621 - GTTCCCAGTACCGTGAGGTGTTATCATTAGTTAGCATTACGGAATACATGTCCAACATGT - 13680 
-VPSTVRCYH*LALRNTCPTC
-FPVP*GVIIS*HYGIHVQHV
SQYREVLSLVSITEYMSNMW
13681 - GGCCAGTAAGCTCATCATGTAACTTTCTAATGTATTGTAAATACAAGTGAAAGACATCAG - 13740
-GQ*AHHVTF*CIVNTSERHQ
-ASKLIM*LSNVL*IQVKDIS
PVSSSCNFLMYCKYK*KTSA 
13741 - CATACTCCTGATTAGGATGTTTTGTAAGTGGGTAAGCATCAATAGCCAGTGACACGAACC - 13800 
-HTPD*DVL*VGKHQ*PVTRT 
-ILLIRMFCKWVSINSQ*HEP 
YS*LGCFVSG*ASIASDTNL 
13801 - TTTCAATCATAAGTGTACCATCTGTTTTGACAATATCATCGACAAAACAGCCTGCGCCTA - 13860 

-FQS*VYHLF*QYHRQNSLRL
-FNHKCTICFDNIIDKTACA*
SIISVPSVLTISSTKQPAPN
13861 - ATATTCTTGATGGATCTGGGTAAGGCAGGTACACGTAATCATCTCCTTGTTTAACTAGCA - 13920 

-IFLMDLGKAGTRNHLLV*LA
-YS*WIWVRQVHVIISLFN*H

ILDGSG*GRYT*SSPCLTSI 
13921 - TTGTATGCTGTGAGCAAAATTCGTGAGGTCCTTTAGTAAGGTCAGTCTCAGTCCAACATT - 13980 
-LYAVSKIREVL**GQSQSNI 
-CML*AKFVRSFSKVSLSPTF 
VCCEQNS*GPLVRSVSVQHF 
13981 - TTGCCTCAGACATGAACACATTATTTTGATAATAAAGAACTGCCTTAAAGTTCTTAATGC - 14040 
-LPQT*THYFDNKELP*SS*C 
-CLRHEHIILIIKNCLKVLNA
ASDMNTLF***RTALKFLML
14041 - TAGCTACTAAACCTTGAGCCGCATAGTTACTGTTATAGCACACAACGGCATCATCAGAAA - 14100 
-*LLNLEPHSYCYSTQRHHQK 
-SY*TLSRIVTVIAHNGIIRK
ATKP*AA*LLL*HTTASSER 
14101 - GAATCATCATGGAGAAATGTTTACGCAGGTAAGCGTAAAACTCATCCACGAATTCATGAT - 14160 

-ESSWRNVYAGKRKTHPRIHD
-NHHGEMFTQVSVKLIHEFMI 

IIMEKCLRR*A*NSSTNS*S 
14161 - CAACATCCCTATTTCTATAGAGACACTCATAGAGCCTGTGTTGTAGATTGCGGACATACT - 14220 
-QHPYFYRDTHRACVVDCGHT
-NIPISIETLIEPVL*IADIL
TSLFL*RHS*SLCCRLRTYL
14221 - TGTCAGCTATCTTATTACCATCAGTTGAAAGAAGTGCATTTACATTGGCTGTAACAGCTT - 14280 
-CQLSYYHQLKEVHLHWL*QL
-VSYLITIS*KKCIYIGCNSL
SAILLPSVERSAFTLAVTA*
14281 - GACAAATGTTAAAGACACTATTAGCATAAGCAGTTGTAGCATCACCGGATGATGTTCCAC - 14340
-DKC*RHY*HKQL*HHRMMFH
-TNVKDTISISSCSITG*CST 
QMLKTLLA*AVVASPDDVPP 

14341 - CTGGTTTAACATATAGTGAGCCGCCACACATGACCATCTCACTTAATACTTGCGCACACT - 14400 
-LV*HIVSRHT*PSHLILAHT
-WFNI**AATHDHLT*YLRTL

GLTYSEPPHMTISLNTCAHS 
14401 - CGTTAGCTAACCTGTAGAAACGGTGTGATAAGTTACAGCAAGTGTTATGTTTGCGAGCAA - 14460
-R*LTCRNGVISYSKCYVCEQ
-VS*PVETV**VTASVMFASK 

LANL*KRCDKLQQVLCLRAR 
14461 - GAACAAGAGAGGCCATTATCCTAAGCATGTTAGGCATGGCTCTGTCACATTTTGGATAAT - 14520 
-EQERPLS*AC*AWLCHILDN
-NKRGHYPKHVRHGSVTFWII
TREAIILSMLGMALSHFG*S 
14521 - CCCAACCCATAAGGTGTGGAGTTTCTACATCACTGTAAACAGTTTTTAACATATTATGCC - 14580 

-PNP*GVEFLHHCKQFLTYYA
-PTHKVWSFYITVNSF*HIMP 

QPIRCGVSTSL*TVFNILCQ 
14581 - AGCCACCGTAAAACTTGCTTGTTCCAATTACCACAGTAGCTCCTCTAGTGGCGGCTATTG - 14640 
-SHRKTCLFQLPQ*LL*WRLL
-ATVKLACSNYHSSSSSGGY* 
PP*NLLVPITTVAPLVAAID 
14641 - ACTTCAATAATTTCTGATGAAACTGTCTATTTGTCATAGTACTACAGATAGAGACACCAG - 14700 

-TSIISDETVYLS*YYR*RHQ
-LQ*FLMKLSICHSTTDRDTS 

FNNF**NCLFVIVLQIETPA 
14701 - CTACGGTGCGAGCTCTATTCTTTGCACTAATGGCATACTTAAGATTCATTTGAGTTATAG - 14760
-LRCELYSLH*WHT*DSFEL*
-YGASSILCTNGILKIHLSYS
TVRALFFALMAYLRFI*VIV
14761 - TAGGGATGACATTACGCTTAGTATACGCGAAAAGTGCATCTTGATCCTCATAACTCATTG - 14820 

-*G*HYA*YTRKVHLDPHNSL
-RDDITLSIREKCILILITH*

GMTLRLVYAKSAS*SS*LIE 
14821 - AGTCATAATAAAGTCTAGCCTTACCCCATTTATTAAATGGGAAACCAGCTGATTTATCCA - 14880
-SHNKV*PYPIY*MGNQLIYP
-VIIKSSLTPFIKWETS*FIQ

S**SLALPHLLNGKPADLSR 
14881 - GATTGTTAACGATTACTTGGTTGGCATTAATACAGCCACCATCGTAACAATCAAAGTATT - 14940 
-DC*RLLGWH*YSHHRNNQSI
-IVNDYLVGINTATIVTIKVF
LLTITWLALIQPPS*QSKYL
14941 - TATCAACAACTTCAACTACGAATAGGAGTTGTCTGATATCACACATTGTTGGCAGATTAT - 15000 
-YQQLQLRIGVV*YHTLLADY
-INNFNYE*ELSDITHCWQII
STTSTTNRSCLISHIVGRL* 
15001 - AACGATAATAGTCATAATCACTGATAGCAGCGTTGCCATCCTGAGCAAAGAAGAAGTGTT - 15060 
-NDNSHNH**QRCHPEQRRSV
-TIIVIITDSSVAILSKEEVF

R**S*SLIAALPS*AKKKCF 
15061 - TTAGTTCAACAGAACTTCCTTCCTTAAAGAAACCTTTAGACACAGCAAAGTCATAAAAGT - 15120 
-LVQQNFLP*RNL*TQQSHKS
-*FNRTSFLKETFRHSKVIKV
SSTELPSLKKPLDTAKS*KS
15121 - CTTTATTAAAATTACCGGGTTTGACAGTTTGAAAAGCAACATTGTTTGTTAGTGCAGCTA - 15180
-LY*NYRV*QFEKQHCLLVQL
-FIKITGFDSLKSNIVC*CSY
LLKLPGLTV*KATLFVSAAT

15181 - CTGAAAAGCATGTAGTGCGTTTATCTAGCAATAAATTGCCAGAAGCTGCATGCATAGCTG - 15240
-LKSM*CVYLAINCQKLHA*L
-*KACSAFI*Q*IARSCMHSW

EKHVVRLSSNKLPEAACIAG 
15241 - GATCAGCAGCATACACTAAAAGTTCCTTGAAACTGAGACGCGAGCTATGTAAGTTTACAT - 15300
-DQQHTLKVP*N*DASYVSLH
-ISSIH*KFLETETRAM*VYI

SAAYTKSSLKLRRELCKFTS 
15301 - CCTGATTATGTACGACTCCTAACTCACGAAAATGGTATCCAGTTGAAACAACAAAAGGAA - 15360
-PDYVRLLTHENGIQLKQQKE 
-LIMYDS*LTKMVSS*NNKRN 
*LCTTPNSRKWYPVETTKGT 
15361 - CACCATCTACAAATATTTTTCTTACTAGTGGTCCAAAACTTGTAGGTGGAAACACAGTAG - 15420 
-HHLQIFFLLVVQNL*VETQ*
-TIYKYFSY*WSKTCRWKHSR
PSTNIFLTSGPKLVGGNTVE
15421 - AAAATAACACATTAAAGTTTGCACAATGAAGGATACACCTATCATCCAAACAGTTAATAC - 15480 

-KITH*SLHNEGYTYHPNS*Y
-K*HIKVCTMKDTPIIQTVNT
NNTLKFAQ*RIHLSSKQLIQ
15481 - AATTGGGATGGTATGTCTGGTCCCAATATTTAAAATAACGGTCGAAGAGACAAAGTCTCT - 15540
-NWDGMSGPNI*NNGRRDKVS
-IGMVCLVPIFKITVEETKSL
LGWYVWSQYLK*RSKRQSLS
15541 - CTTCCGTAAAATCATATTTCAGCAAATCCCACTTAATAAGTGGTTTTGCGAGATCAGCAT - 15600 
-LP*NHISANPT**VVLRDQH
-FRKIIFQQIPLNKWFCEISI
SVKSYFSKSHLISGFARSAS
15601 - CCATATGGGACTCAGCAGCCAATGCCCTAGTCAAAGTGAGGATGGGCATCAGCAATGAGT - 15660
-PYGTQQPMP*SK*GWASAMS
-HMGLSSQCPSQSEDGHQQ*V
IWDSAANALVKVRMGISNE*
15661 - AATATGAATCCACAATAGGAACTCCGCAGCCTGGTGCTACTTGTACGAAATCACCGAAAT - 15720
-NMNPQ*ELRSLVLLVRNHRN
-I*IHNRNSAAWCYLYEITEI
YESTIGTPQPGATCTKSPKS
15721 - CGTACCAGTTCCCATTAAGATCCTGATTATCTAATGTCAGTACGCCTACAATGCCTGCAT - 15780
-RTSSH*DPDYLMSVRLQCLH
-VPVPIKILII*CQYAYNACI
YQFPLRS*LSNVSTPTMPAS 
15781 - CACGCATAGCATCGCAGAATTGTACAGTCTTTAATAATGATTGGCGTACACGCTCACCTA - 15840 
-HA*HRRIVQSLIMIGVHAHL
-THSIAELYSL***LAYTLT*
RIASQNCTVFNNDWRTRSPK
15841 - AGTTAGCATATACGCGTAAGATGTCAGGATTCTCTACGAAGTCATACCAATCCTTCTTAT - 15900 

-S*HIRVRCQDSLRSHTNPSY
-VSIYA*DVRILYEVIPILLI
LAYTRKMSGFSTKSYQSFLL
15901 - TGAAATAATCATCATCACAGCAATTGTATGTGACGAGTATTTCTTTTAATGTATCACAAT - 15960 
-*NNHHHSNCM*RVFLLMYHN
-EIIIITAIVCDEYFF*CITI
K*SSSQQLYVTSISFNVSQL
15961 - TACCCTCATCAAAATGACGTAGAGCATAGACTAAATCAGCCATTGTGTATTTAGTTAGAC - 16020
-YPHQNDVEHRLNQPLCI*LD
-TLIKMT*SID*ISHCVFS*T
PSSK*RRA*TKSAIVYLVRR
16021 - GCTGACGTGATATATGTGGTACCATGTCACCATCTACTCTAAACTTGAAAAAGTCATGGA - 16080
-ADVIYVVPCHHLL*T*KSHG
-LT*YMWYHVTIYSKLEKVMD 
*RDICGTMSPSTLNLKKSWT 
16081 - CAGCAACCGCTGGACAATCTTTAACCAAGTTATAAATAGTCTCTTCATGTTGGTAGTTAG - 16140 
-QQPLDNL*PSYK*SLHVGS*
-SNRWTIFNQVINSLFMLVVR
ATAGQSLTKL* IVSSCW*LD 
16141 - ACATAGTATGCCTCTTAACTACAAAGTAAGAGTCTAATAAATTGCCTTCCTCATCCTTCT - 16200
-T*YAS*LQSKSLINCLPHPS
-HSMPLNYKVRV**IAFLILL
IVCLLTTK*ESNKLPSSSFS 
16201 - CCTGGAAGCGACAGCAATTAGTTTTTAGGAACTTTGCAAAACCAGCACTTTTTTCGTTGT - 16260 
-PGSDSN*FLGTLQNQHFFRC
-LEATAISF*ELCKTSTFFVV
WKRQQLVFRNFAKPALFSL* 
16261 - AAATATCAAAAGCCCTGTAGACGACATCAGTACTAGTGCCTGTGCCGCACGGTGTAAGAC - 16320 
-KYQKPCRRHQY*CLCRTV*D
-NIKSPVDDISTSACAARCKT 
ISKAL*TTSVLVPVPHGVRR 
16321 - GGGCTGCACTTACACCGCAAACCCGTTTAAAAACGTTGATGCATCCGCAGACTGCATCAA - 16380 
-GLHLHRKPV*KR*CIRRLHQ 
-GCTYTANPFKNVDASADCIK 
AALTPQTRLKTLMHPQTASR 
16381 - GGGTTCGCGGAGTTGGTCACAACTACAGCCATAACCTTTCCACATTCCGCAGACGGTACA - 16440
-GFAELVTTTAITFPHSADGT 
-GSRSWSQLQP*PFHIPQTVQ 
VRGVGHNYSHNLSTFRRRYR 
16441 - GACTGTGTTTCTAAGTGTAAAACCCACTGGGTCATTAGCACAAGTGGTAGGTATTTGGAC - 16500 
-DCVSKCKTHWVISTSGRYLD
-TVFLSVKPTGSLAQVVGIWT
LCF*V*NPLGH*HKW*VFGR
16501 - GTACTTACCTTTCAAGTCACAGAATCCTTTAGGATTTGGATGGTCAATGTGGCATCTACA - 16560
-VLTFQVTESFRIWMVNVAST
-YLPFKSQNPLGFGWSMWHLQ
TYLSSHRIL*DLDGQCGIYN
16561 - ATACAGACAACATGAAGCACCACCAAAGGACTCTTGGTCCATGTTAGCTTCTGGTGTTAC - 16620 
-IQTT*STTKGLLVHVSFWCY
-YRQHEAPPKDSWSMLASGVT
TDNMKHHQRTLGPC*LLVLQ
16621 - AGTAATTGCCTGTCCTGTACCAGTGTGTGTACACAACATCTTCACACAGTTGGTGATTGG - 16680 
-SNCLSCTSVCTQHLHTVGDW 
-VIACPVPVCVHNIFTQLVIG 
*LPVLYQCVYTTSSHSW*LV 
16681 - TTGTCCTCCACTTGCTAGGTAATCCTTATATGCTTTAGCAGGGTCTACTGCAAAAGCACA - 16740
-LSSTC*VILICFSRVYCKST
-CPPLAR*SLYALAGSTAKAQ
VLHLLGNPYML*QGLLQKHR 
16741 - GAAGGAAAGCACAGTTGAATTGGCAGGTACTTCTGTAGCATTTCCAGCCTGAAGACGTAC - 16800 
-EGKHS*IGRYFCSISSLKTY
-KESTVELAGTSVAFPA*RRT
RKAQLNWQVLL*HFQPEDVL
16801 - TGTAGCAGCTAAACTGCCCAGCACCATACCTCTATTTAGGTTGTTTAAGCCTTTGATGAA - 16860
-CSS*TAQHHTSI*VV*AFDE
-VAAKLPSTIPLFRLFKPLMK
*QLNCPAPYLYLGCLSL**S

16861 - GTACAAGTATTTCACTTTAGGCCCTTTTGGTGTGTCTGTAACAAACCTACAAGGTGGTTC - 16920
-VQVFHFRPFWCVCNKPTRWF
-YKYFTLGPFGVSVTNLQGGS 
TSISL*ALLVCL*QTYKVVP 

16921 - CAGTTCTGTGTAAATTGTACCTGTACCATCACTCTTAGGGAATCTAGCCCATTTGAGATC - 16980 
-QFCVNCTCTITLRESSPFEI 

-SSV*IVPVPSLLGNLAHLRS

VLCKLYLYHHS*GI*PI*DL 
16981 - TTGGTGGTCTGATAGTAATGCCAGCACAAACCTACCTCCCTTCGAATTGTTATAGTAGGC - 17040 
-LVV***CQHKPTSLRIVIVG 
-WWSDSNASTNLPPFELL**A
GGLIVMPAQTYLPSNCYSRQ 
17041 - AAGTGCATTGTCATCAGTACAAGCTGTTTGTGTGGTACCAGCCGCACAGGACATCTGTCG - 17100
-KCIVISTSCLCGTSRTGHLS 
-SALSSVQAVCVVPAAQDICR 
VHCHQYKLFVWYQPHRTSVV 
17101 - TAGTGCTACTGGACTCAGTTCATTATTCTGTAGTTTAACAGCTGAGTTGGCTCTTAGAGC - 17160 
-*CYWTQFIIL*FNS*VGS*S 
-SATGLSSLFCSLTAELALRA 
VLLDSVHYSVV*QLSWLLEL 
17161 - TGTAACAATAAGAGGCCAAGCCAAATTTGGTGAATTGTCCATGTTAATTTCACTAAGTTG - 17220 
-CNNKRPSQIW*IVHVNFTKL
-VTIRGQAKFGELSMLISLS* 
*Q*EAKPNLVNCPC*FH*VE 
17221 - AACAATCTTGCTATCCGCATCAACAACTTGCTGGATTTCCCAGAGTGCAGATGCATATGT - 17280 
-NNLAIRINNLLDFPECRCIC 

-TILLSASTTCWISQSADAYV

QSCYPHQQLAGFPRVQMHM* 
17281 - AAAGGTGTTACCATCACAAGTGTTCTTGTAGGTACCATAATCAGGGACAACAACCATGAG - 17340 
-KGVTITSVLVGTIIRDNNHE
-KVLPSQVFL*VP*SGTTTMS 
RCYHHKCSCRYHNQGQQP*V 
17341 - TTTGGCTGCTGTAGTCAATGGTATGATGTTGAGTGGAACACAACCATCACGCGCATTGTT - 17400
-FGCCSQWYDVEWNTTITRIV
-LAAVVNGMMLSGTQPSRALL
WLL*SMV*C*VEHNHHAHC*
17401 - GATAATGTTGTTAAGTGCATCATTATCAAGCTTCCTAAGCATAGTGAAGAGCATTGTTTG - 17460 
-DNVVKCIIIKLPKHSEEHCL
-IMLLSASLSSFLSIVKSIVC
*CC*VHHYQAS*A**RALFA
17461 - CATAGCACTAGTTACTTTTGCCGTCTTGTCCTCAGATCTTGCCTGTTTGTACATTTGGGT - 17520
-HSTSYFCPLVLRSCLFVHLG
-IALVTFALLSSDLACLYIWV 
*H*LLLPSCPQILPVCTFGS 
17521 - CATAGCCTGATCTGCCATCTTTTCCAACTTGCGTTGCATGGCAGCATCACGGTCAAACTC - 17580 
-HSLICHLFQLALHGSITVKL 
-IA*SAIFSNLRCMAASRSNS
*PDLPSFPTCVAWQHHGQTQ 
17581 - AGATTTAGCCACATTCAAAGATTTCTTTAACTTTTTGAGAACGACTTCAGAATCACCATT - 17640
-RFSHIQRFL*LFENDFRITI
-DLATFKDFFNFLRTTSESPL
I*PHSKISLTF*ERLQNHH*
17641 - AGCTACAGCCTGCTCATAGGCCTCCTGGGCAGTGGCATAAGCGGCATATGATGGTAAAGA - 17700
-SYSLLIGLLGSGISGI*W*R
-ATACS*ASWAVA*AAYDGKE
LQPAHRPPGQWHKRHMMVKN

17701 - ACTAAATTCTGAAGCAATAGCCTGAAGAGTAGCACGGTTATCGAGCATTTCCTCGCACAA - 17760 

-TKF*SNSLKSSTVIEHFLAQ
-LNSEAIA*RVARLSSISSHN

*ILKQ*PEE*HGYRAFPRTT 
17761 - CCTATTAATGTCTACAGCACCCTGCATGGATAGCAAAACAGACAAAAGAGAAACCATCTT - 17820 
-PINVYSTLHG*QNRQKRNHL 
-LLMSTAPCMDSKTDKRETIF 
Y*CLQHPAWIAKQTKEKPSS 
17821 - CTCGAAAGCTTCAGTTGTGTCTTTTGCAAGAAGAATATCATTGTGGAGTTGTACACATTG - 17880
-LESFSCVFCKKNIIVELYTL
-SKASVVSFARRISLWSCTHC 
RKLQLCLLQEEYHCGVVHIV 
17881 - TGCCCACAATTTAGAAGATGACTCTACTCTAAGTTGTTGAAGAACCGAGAGCAGTACCAC - 17940
-CPQFRR*LYSKLLKNREQYH
-AHNLEDDSTLSC*RTESSTT
PTI*KMTLL*VVEEPRAVPQ 
17941 - AGATGTGCACTTTACGTCAGACATTTTAGACTGTACAGTAGCAACCTTGATACATGGTTT - 18000 
-RCALYVRHFRLYSSNLDTWF 

-DVHFTSDILDCTVATLIHGL

MCTLRQTF*TVQ*QP*YMVY 
18001 - ACCTCCAATACCCAACAACTTAATGTTAAGCTTGAAAGCATCAATACTACTCTTAGGAGG - 18060
-TSNTQQLNVKLESINTTLRR
-PPIPNNLMLSLKASILLLGG 
LQYPTT*C*A*KHQYYS*EA 
18061 - CAAAAGCCCCTGGGAGTTCATATACCTAAATTCTTGTGTAGAGACCAAGTAGTCATAAAC - 18120
-QKPLGVHIPKFLCRDQVVIN
-KSPWEFIYLNSCVETK*S*T 
KAPGSSYT*ILV*RPSSHKH 
18121 - ACCAAGAGTAAGCCTGAAGTAACGGTTGAGTAAACAGAAAAGGCCAAAGTAGCAGCAGCA - 18180 
-TKSKPEVTVE*TEKAKVAAA
-PRVSLK*RLSKQKRPK*QQQ 
QE*A*SNG*VNRKGQSSSSN 
18181 - ACAATAGCCTAAGAAACAATAAACAAGCATGATACACTGTAAGGTGTTGCCAGTAATAAA - 18240 
-TIA*ETINKHDTL*GVASNK
-Q*PKKQ*TSMIHCKVLPVIN
NSLRNNKQA*YTVRCCQ**I
18241 - TAACAATGGGTAATACTCAACACACACAAACACTATAGCTCTAGCTAAAAACATGATAGT - 18300 
-*QWVILNTHKHYSSS*KHDS
-NNG*YSTHTNTIALAKKMIV

TMGNTQHTQTL*L*LKT**S 
18301 - CGTAACGACACCAGAATAGTTAGAGGTTACAGAAATAACTAAGGCCCACATGGAAATAGC - 18360
-RNDTRIVRGYRNN*GPHGNS
-VTTPE*LEVTEITKAHMEIA
*RHQNS*RLQK*LRPTWK*L
18361 - TTGATCTAAAGCATTACCATAGTAGACTTTGTAAACAAGTGTAATGACATTCATCAGTGT - 18420 
-LI*SITIVDFVNKCNDIHQC 
-*SKALP**TL*TSVMTFISV
DLKHYHSRLCKQV**HSSVS 
18421 - CCAAACACGTCTAGCAGCATCATCATAAACAGTGCGAGCTGTCATGAGAATAAGCAAAAC - 18480
-PNTSSSIIINSASCHENKQN
-QTRLAASS*TVRAVMRISKT
KHV*QHHHKQCELS*E*AKL
18481 - TAAAGCTGAAGCATACATAACACAATCCTTAAGCCTATAACCAGACAAGCTAGTGTCAGC - 18540
-*S*SIHNTILKPITRQASVS
-KAEAYITQSLSL*PDKLVSA
KLKHT*HNP*AYNQTS*CQP
18541 - CAATTCAAGCCATGTCATGATACGCATCACCCAGCTAGCAGGCATGTAGACCATATTAAA - 18600 
-QFKPCHDTHHPASRHVDHIK
-NSSHVMIRITQLAGM*TILK 
IQAMS*YASPS*QACRPY*S 
18601 - GTAAGCAACTGTTGCAAGAGAAGGTAACAGAAACAAGCACAAGAATGCGTGCTTATGCTT - 18660 
-VSNCCKRR*QKQAQECVLML
-*ATVAREGNRNKHKNACLCL
KQLLQEKVTETSTRMRAYA*
18661 - AACAAGCAGCATAGCACATGCAGCAATTGCCATAATACCAAGAGTAAATGGCAAGAAAGC - 18720
-NKQHSTCSNCHNTKSKWQES
-TSSIAHAAIAIIPRVNGKKA
QAA*HMQQLP*YQE*MARKK
18721 - ATTCTCGTAAACAAAGAAAAACAGTGACCACTGTGTACTTTGAACAAGAATCAATAGTGA - 18780 
-ILVNKEKQ*PLCTLNKNQ**
-FS*TKKNSDHCVL*TRINSD
SRKQRKTVTTVYFEQESIVM
18781 - TGTCAAGAAAGTTAAAAGCATCCAATGATGAGTGCCCTTAACAATTTTCTTGAACTTACC - 18840 
-CQES*KHPMMSALNNFLELT
-VKKVKSIQ**VPLTIFLNLP
SRKLKASNDECP*QFS*TYL
18841 - TTGGAAGGTAACACCAGAGCATTGTCTAACAACATCAAATGGTGTAAACTCATCTTCTAA - 18900
-LEGNTRALSNNIKWCKLIF*
-WKVTPEHCLTTSNGVNSSSK 
GR*HQSIV*QHQMV*THLLK 
18901 - AATAGTGCTACCAAGGATAGTACGACCATTCATACCATTCTGCAGCAGCTCTTTCAAAGC - 18960 

-NSATKDSTTIHTILQQLFQS
-IVLPRIVRPFIPFCSSSFKA 

*CYQG*YDHSYHSAAALSKQ 
18961 - AGCACACATATCTAAGACGGCAATTCCTGTTTGAGCAGAAAGAGGTCCCAATATGTCAAC - 19020 

-STHI*DGNSCLSRKRSQYVN
-AHISKTAIPV*AERGPNMST 

HTYLRRQFLFEQKEVPICQH 
19021 - ATGATCTTGTGTCAAAGGTTCATAGTTGTACTTCATTGCCACAAGGTTAAAGTCATTCAA - 19080 
-MILCQRFIVVLHCHKVKVIQ 
-*SCVKGS*LYFIATRLKSFK 
DLVSKVHSCTSLPQG*SHSK 
19081 - AGTAGTGGTGAATCTATTAAGAAACCACCTATCACCATTGATAACAGCAGCATACAGCCA - 19140 
-SSGESIKKPPITIDNSSIQP 
-VVVNLLRNHLSPLITAAYSH 
*W*IY*ETTYHH**QQHTAM 
19141 - TGCCAAAACATTTAATGTTATGGTTGTGTCTGTACCTGCAGCCTGTGCAGTTTGTCTGTC - 19200 

-CQNI*CYGCVCTCSLCSLSV
-AKTFNVMVVSVPAACAVCLS 

PKHLMLWLCLYLQPVQFVCQ 
19201 - AACAAATGGACCATAGAATTTACCTTCTAAGTCAGTACCAGCGTGTACTCCTGTTGGAAG - 19260 
-NKWTIEFTF*VSTSVYSCWK 
-TNGP*NLPSKSVPACTPVGS 
QMDHRIYLLSQYQRVLLLEA 
19261 - CTCCATATGATGCATATAGCAGAAAGACACGCAATCATAATCAATGTTAAAACCAACACT - 19320 
-LHMMHIAERHAIIINVKTNT
-SI*CI*QKDTQS*SMLKPTL
PYDAYSRKTRNHNQC*NQHY
19321 - ACCACATGATCCATTAAGGAAAGAACCTTTAATGGTATGATTAGGTCTCATGGCACACTG - 19380
-TT*SIKERTFNGMIRSHGTL
-PHDPLRKEPLMV*LGLMAH*

HMIH*GKNL*WYD*VSWHTD 
19381 - ATAAACACCAGATGGTGAACCATTGTAGCATGCTAGAACTGAAAATGTTTGACCAGGTTG - 19440 
-INTRW*TIVAC*N*KCLTRL 
-*TPDGEPL*HARTENV*PGW 
KHQMVNHCSMLELKMFDQVG 
19441 - GATACGGACAAATTTATACTTGGGTGTCTTAGGGTTAGAAGTATCAACTTTAAGCCTAAG - 19500
-DTDKFILGCLRVRSINFKPK
-IRTNLYLGVLGLEVSTLSLS
YGQIYTWVS*G*KYQL*A*A
19501 - CAGACAATTTTGCATAGAATGGCCAATAACACGAAGTTGAACATTGCCAGCCTGAACAAG - 19560
-QTILHRMANNTKLNIASLNK
-RQFCIEWPITRS*TLPA*TR 

DNFA*NGQ*HEVEHCQPEQE 
19561 - AAAGCTATGGTTGGATTTGCGAATGAGCAGATCTTCATAGTTAGGATTAAGCATGTCTTC - 19620 
-KAMVGFANEQIFIVRIKHVF
-KLWLDLRMSRSS*LGLSMSS
SYGWICE*ADLHS*D*ACLL
19621 - TGCTGTGCAAATGACATGTCTTGGACAGTATACTGTGTCATCCAACCACAATCCATTAAG - 19680 
-CCANDMSWTVYCVIQPQSIK 
-AVQMTCLGQYTVSSNHNPLR
LCK*HVLDSILCHPTTIH*E
19681 - AGTTGTAGTTCCACAGGTTACTTGTACCATGCACCCTTCAACTTTGCCTGACGGGAATGC - 19740
-SCSSTGYLYHAPFNFA*REC
-VVVPQVTCTMHPSTLPDGNA
L*FHRLLVPCTLQLCLTGMP
19741 - CATTTTCCTAAAACCACTCTGCAGAACAGCAGAAGTGATTGATGTCTGTGGTGGTTGGTA - 19800 
-HFPKTTLQNSRSD*CLWWLV
-IFLKPLCRTAEVIDVCGGW*
FS*NHSAEQQK*LMSVVVGR
19801 - GAGAACATCAGCACCTGAGTTGCTAAAGTCATTTAGAGCCTTTGCTAAGTGGCAGCAAGC - 19860 
-ENIST*VAKVI*SLC*VAAS
-RTSAPELLKSFRAFAKWQQA
EHQHLSC*SHLEPLLSGSKL
19861 - TGCTTCACGATAGCTGGTAGTATCTAAGGCTCCACTGAAATACTTGTACTTGTTATATAG - 19920 
-CFTIAGSI*GSTEILVLVI* 

-ASR*LVVSKAPLKYLYLLYR
LHDSW*YLRLH*NTCTCYIE
19921 - AGCAAGATACCTGTTATACTGTGTAAGTGGCAACAGTGTCTCGCTACGCAATTTTAGGTA - 19980
-SKIPVILCKWQQCLATQF*V
-ARYLLYCVSGNSVSLRNFRY
QDTCYTV*VATVSRYAILGT 
19981 - CATTTCCTTGTTGAGCAAAAAGGTACACAAAGCAGCCTCCTCGAAGGTACTAAATGTAAC - 20040 
-HFLVEQKGTQSSLLEGTKCN
-ISLLSKKVHKAASSKVLNVT

FPC*AKRYTKQPPRRY*M*L 
20041 - TCCATTAAACATGACTCTTTTCCTAAGATAGTTGTTAAAGAACCAATGGCAGTGCTTCAG - 20100 
-SIKHDSFPKIVVKEPMAVLQ
-PLNMTLFLR*LLKNQWQCFR
H*T*LFS*DSC*RTNGSASE 
20101 - AGAAATACAGAATACATAGATTGCTGTTATCCAAAAAGGCACAATAGGAGAAAACATGGC - 20160 
-RNTEYIDCCYPKRHNRRKHG 
-EIQNT*IAVIQKGTIGENMA 
KYRIHRLLLSKKAQ*EKTWQ
20161 - AAACCATTGAAGGTGAGCCAAGAATGAAACATCATTGGTGAAATAGAATGTCAAGTACAA - 20220
-KPLKVSQE*NIIGEIECQVQ
-NH*R*AKNETSLVK*NVKYK 
TIEGEPRMKHHW*NRMSSTS 

20221 - GTAAAAGACTGAGTAGACTCCCGGCAGAAAGCTGTAAGCTGGTACCAGACAGAGTATAGT - 20280 

-VKD*VDSRQKAVSWYQTEYS
-*KTE*TPGRKL*AGTRQSIV

KRLSRLPAESCKLVPDRV** 
20281 - GAAAGACATCAAAAACAAAAGTGCATTAGCAGCAACAACATGGTTGTACTCACCAAAAAC - 20340 
-ERHQKQKCISSNNMVVLTKN
-KDIKNKSALAATTWLYSPKT 
KTSKTKVH*QQQHGCTHQKH 
20341 - ACGTCTGAATTTCATAAAGTAGTAGGCAGCACAAGTCACCAATATGGCAATAATACCACC - 20400 
-TSEFHKVVGSTSHQYGNNTT 

-RLNFIK**AAQVTNMAIIPP

V*IS*SSRQHKSPIWQ*YHQ 
20401 - AGCCACTACTGAAGCAGACACATCTAAAGCACCCACAGGTTGCACAAGAGGAGTAAAGAT - 20460
-SHY*SRHI*STHRLHKRSKD
-ATTEADTSKAPTGCTRGVKM 
PLLKQTHLKHPQVAQEE*RC 
20461 - GTTAGCTATGAGATTCATCGCATCAACACCACAGAAAACTCCTGATAGAGCTCTGTAATG - 20520 
-VSYEIHRINTTENS**SSVM
-LAMRFIASTPQKTPDRAL*C
*L*DSSHQHHRKLLIELCNA
20521 - CTCATTATTAAGAACCCATCTACCACTGGTAGATAGGCAAATACCTACTTCTGACCTTTC - 20580
-LIIKNPSTTGR*ANTYF*PF
-SLLRTHLPLVDRQIPTSDLS
HY*EPIYHW*IGKYLLLTFR
20581 - GCATGTACCATGTCTACAGTACTCAGCATCAAAAGTTGTTACTACTCTAACAGAACCCTC - 20640
-ACTMS TVLSIKSCYYSNRTL 
-HVPCLQYSASKVVTTLTEPS 
MYHVYSTQHQKLLLL*QNPP 
20641 - CAGGTAAGTGTTAGGAAACTGTATGATGGAACCATCCATAAGCACATAACGAGTGTCTGG - 20700
-QVSVRKLYDGTIHKHITSVW
-R*VLGNCMMEPSIST*RVSG
GKC*ETV*WNHP*AHNECLD
20701 - ACGAAGCTCACTATAAGAAATAGAACCCTCTAGCAAATTAGTGTCATAACAATATGGCAC - 20760 
-TKLTIRNRTL*QISVITIWH
-RSSL*EIEPSSKLVS*QYGT
EAHYKK*NPLAN*CHNNMAQ
20761 - AGGTTTGCCCATAGCATCCTTAAAAATTGTACACTCAGCAGCAAGAACGCAAGCAGAGGT - 20820
-RFAHSILKNCTLSSKNASRG
-GLPIASLKIVHSAARTQAEV
VCP*HP*KLYTQQQERKQR*
20821 - AGCAAAATCACTATACTCAATGAGTTTGGAAGGTGTGTAGCAAATGTTGCCAACAGCACT - 20880 
-SKITILNEFGRCVANVANST 
-AKSLYSMSLEGV*QMLPTAL 
QNHYTQ*VWKVCSKCCQQH* 
20881 - AAAAACACGAGGTAGAAAATGCAAGAAGTCACCATTGATTGCTCTCAGCACAGTACCCGG - 20940
-KNTR*KMQEVTIDCSQHSTR
-KTRGRKCKKSPLIALSTVPG
KHEVENARSHH*LLSAQYPV 
20941 - TAAGCCAGGCACTATGAAACCAATCTCTCTTGTAATGATAGCAGCTACTACAGGGCAGCT - 21000 

- * ARHYETNLSCNDSSYYRAA 
-KPGTMKPISLVMIAATTGQL 

SQAL*NQSLL* * *QLLQGSF 
21001 - TTTGTCATTTTTGTATGAACCACCACGCTGGCTAAACCATGCGTCAAAACCAGCATGTTT - 21060
~ F V I F V * TTT LAKPCVKTSMF 
-LSFLYEPPRWLNHASKPACL 
CHFCMNHHAG*TMRQNQHVY 
21061 - ATTTGCAAAACAATCATCAGTAGAAATGATGTCACGAGTGACACCATCCTGAATGGCTTT - 21120
-ICKTIISRNDVTSDTILNGF
-FAKQSSVEMMSRVTPS*MAL
LQNNHQ*K*CHE*HHPEWLC
21121 - GTAACCAATGATTTCATTTGTGTAACCATCATGGATTGACAATGTATGTACTGGCATAAC - 21180 
-VTNDFICVTIMD*QCMYWHN 
-*PMISFV*PSWIDNVCTGIT 
NQ*FHIiCNHHGLTMYVLA*R 
21181 - GATATAACAAACCAATGCAGCAAGAACGCACAATAATGTGGCCTTAAGCATAAGTTTAAA - 21240 
-DI TN QCS KNAQ * CGLKHKFK 
~I*QTNAARTHNNVALSISLK 
YNKPMQQERTIMWP*A*V*N 
21241 - ACAAGTACTAACAATCTTACCACCCTTGAGTGAGATTTTAGTAGTTATGACATTGACAAC - 21300
-TSTNNLTTLE*DFSSYDIDN
-QVLTILPPLSEILVVMTLTT
KY*QSYHP*VRF**L*H*QP
21301 - CTGTCTAGTTGTAGCACAAGTTAGTGTAAAAGGTATGTTGTTCTTCTTGGCAGCAGTACG - 21360 
-LSSCSTS*CKRYVVI>LGSST 
-CLVVAQVSVKGMLFFLAAVR 
V*L*HKLV*KVCCSSWQQYE 
21361 - AATTTGTTTACGCAGCTGTTCAGATAAAGACATGTAGTCTTTTACATTCCAGATGAGTGA - 21420 
-NIjFTQLFR*RHVVFYI P D E * 
-ICLRSCSDKDM* SFTFQMSE 
FVYAAVQIKTCSLLHSR*VK 
21421 - AACATTGTGACTTTTTGCTACTTGGGCATTGATATGCCTTGCATTACAGTCAATACATGC - 21480
-NIVTFCYIiGIDMPCITVNTC 

- TL*LFATWALICLALQSIHA 

HCDFLLLGH*YALHYSQYMR 
21481 - GCCAAGATCTCTGGGCGTCATGTTTTCAACCTTATTATAGGTGAGCATGAAATTGTTACA - 21540

- A K 3 SGRHVFNLI IGEHE I V T 
-PRSLGVMFSTLL*VSMKLLQ 

QDLWASCFQPYYR*A*NCYN 
21541 - ACTGTCACCTGTCACTTCTAAGTCAGAGTGATGTGAAAGTTTGAGACATTCAATAACATC - 21600 
~TVTCHF*VRVM*KF ETFNNI 
-LSPVTSKSE*CESLRHSITS 
CHLSLLSQSDVKV*DIQ*HP 
21601 - CTTTGTGTCAACATCGGTATCAACAACACCTTGTCGGGCAGCTGACACGAATGTAGAAAG - 21660
-LCVNIGINNTLSGS*HECRK 

- FVSTSVSTTPCRAADTNVER 

LCQHRYQQHLVGQLTRM*KG 
21661 - GACACCATCTAAAGCTACACCCTTTGCTAACTCGCTGTGAGCTGTAGCAACAAGTGCCTT - 21720 
-DTI*SYTLC*LAVSCSNKCL 
-TPSKATPFANSL*AVATSAL 
HHLKLHPLLTRCEL^QQVP* 
21721 - AAGTTTTTCCATAGGAACACTAAAAGTTGCTGAAAAGGTGTCGACATAAGCATCAAACAT - 21780 
-KFFRRNTKSC*KGVDIS I K H 

- S F S I GTLKVAEKVST*ASNI 

VFP*EH*KLIiKRCRHKHQTS 
21781 - CTTAACGGAAACTTCAGTACTATCTCCAACGTTTGATACAAGAGCTTGGTCAAGCAACAG - 21840 
-LNGNFST I S N V * YKSLVKQQ 
-LTETSVLSPTFDTRAWSSNR 
* RKLQYYLQRLIQELGQATE 
21841 - AATAGGTTGGCACATCAGCTGACTGTAGTACACAGAAGCAGACTTAGAAGCAGACTCGTC - 21900
-NRLAHQLTVVHRSRLRS RLV 
-IGWHIS*L*YTEADLEADSS 
*VGTSADCSTQKQT*KQTRR 
21901 - GCATTTGGACTTGCCATCAAAAACTATGACATTAATAGGCAGTGAACCTTTAGTGTTGTT - 21960 
-AFGLAIKNY DINRQ* T FSVV 
-HLDLPSKTMTLIGSEPLVLL 
IWTCHQKL*H**AVNL*CC* 
21961 - AGCTCTCAAATTGTCTAAATTGACAAAATGGGAGAGCGGATGTCTCTCATAGGTCTTTTG - 22020 

- S S Q I V * I DKMGERMSL I GLL 
-ALKLSKLTKWESGCLS*VF* 

LSNCLN*QNGRADVSHRSFD 
22021 - ACCAGCCTTGTCAAAGTAGAGGTGAAGCGCGCCATTTTTCACAGCAACACTATCAACAAT - 22080
-TSLVKVEVKRAI FHSNTINN 
-PALSK*R*SAPFFTATLSTI 
QPCQSRGEARHFSQQHYQQY 
22081 - ATACGATGACTGGTCAGTAGGGTTGATTGGTCTTTTAAACTGGAGTGACAAATCACGAGC - 22140 
-IR*LVSRVDWSFKLE* QITS 
-YDDWSVGLIGLLNWS DKSRA 
TMTGQ^G* LVF* TGVTNHEQ 
22141 - AACTTCATCACTAATGAATGTACTACCAGTGCAAAATGTGTCACAATTGAGACAATTCCA - 22200 

- N F I TNECTTSAKCVTIET I P 
-TSSLMNVLPVQNVSQLRQFQ 

L H H * *MYYQCKMCHN* DNSN 
22201 - ATTGTGAGTCTTGCAGAAGCCACGGCCTCCATTTGCATAGACATAGAAAGATCTCTTCAT - 22260 
-IVSLAEATASIC I D I ERSLH 
-L*VLQKPRPPFA*T*KDLFM 
CESCRSHGLHLHRHRKISSC 
22261 - GCCATTAACAATAGTTGTACACTCAACGCGTGTGGCACGATTGCGCTTATAGCACATCAT - 22320 
-AINNS CTLNACGTI ALIAHH 
-PLTIVVHSTRVARLRL*HIM 
H*Q*LYTQRVWHDCAYSTSC 
22321 - GCAAGTCGAAGAGGTGCAACCATCCATGATATGAACATAGCTCTTCCATATGTAGTAGAA - 22380 
-ASRRGAT IHDMNIALPYVVE 
-QVEEVQPSIX1I*T*LFHM**K 
KSKRCNHP*YEHSSSICSRK 
22381 - AGAAGCAAAGAAGATGTACATCCTAACCATTGCAGAAACGGGTGCCATTTGTACAATACT - 22440
-RSKEDVHPNHCRNGCHLYNT
-EAKKMYILTIAETGAICTIL
KQRRCTS*PLQKRVPFVQY*
22441 - AATGATAAACCACATGAGCCAAGAATTGCTGATGAAATGACTAGCAAAATAGCCAAAGAA - 22500 
-NDKPHEPRIADEMTSKIAKE 
-MINHMSQELLMK*LAK*PKN 
** TT*AKNC**ND*QNSQRT 
22501 - CACCTGCATTATAGCTGAAAGACCTAATAAATAAAAGAATTTTGTGAACAACATATATGC - 22560 
-HLHYS*KT**IKEFCEQHIC 
-TCI IAERPNK*KNFVNNIYA 
PAL*LKDLINKRIL* TTYMP 
22561 - CAAAACCCACTCAGCGGCCAGACCTAAAATTGTCAAGTCTAGCTTGTACGATGAAATCGT - 22620 
-QNPLS6QT*NCQV*liVR*NR 
-KTHSAARPKIVKSSLY DE IV 
KPTQRPDLKLSSLACTMKSS 
22621 - CACCTGAATGGTTTCAAGAGCTGGATAAGAATCAAGGGAGTCTAATCCACTTAAACAAAT - 22680
-HLNGFKSWI RIKGV* ST* T N 

- T*MVSRAG*ESRESNPLKQM 

PEWFQELDKNQGSLI HLNKC 
22681 - GCTGCAAGGAAAAGAACCTTCACAGAAATCCATAGTAGTAACGTTAGACGAATTAAGATA - 22740
-AARKRTFTEIHSSNVRRIKI 
-LQGKEPSQKSIVVTLDELRY 
CKEKNLHRNP* * * R * T N * DT 
22741 - CAATTCTCTAACGCCATTACAATAAGAAGGAGCACCAAAATTAGATAAGAGTACACCAAA - 22800 
-QFSNAIT IRRSTKIR*EYTK 
-NSLTPLQ* EGAPKLDKSTPK 
I L * RHYNKKEHQN* I RVHQK 
22801 - AGCAGCAGTTACACAGATTAGAGAACCTAAGCAAATACTTAACAACAATAGCCACATAGC - 22860 
-SSSYTD*RT*ANT*QQ* PHS 
-AAVTQIRE PKQILNNNSHIA 
QQLHRLENLSKYLTTIAT*R 
22861 - GATTGTGAACAATTTAGAAAATTTGGGTGACTTCACATAATTAATGCCGGCATCCAAACA - 22920
-DCEQFRKFG*LHIINAGIQT
-IVNNLENLGDFT*LMPASKH
L*TI*KIWVTSHN*CRHPNI

22921 - TAATTTAGCAACACTCTTAACACTATTTTTAGCAATAGTTGTAGGTAGTGAAGCTCTAAT - 22980

-*FSNTLNTI FSNSCR* * SSN 
-NLATLLTLFLAIVVGSEALI 
I*QHS*HYF*Q*L*VVKL*F 
22981 - TCTAGAATTGGTACTTTTAGTAAAAGTACACAATTGGAACAATAATGTAAACACATAAGG - 23040 
-SRIG~TFSKSTQLEQ*CKHIR 
-LELVLLVKVHNWNNNVNT * G 
* N W Y F * *KYTIGTIM*THKA 
23041 - CATATAATTGTTAAACACACGTTGTGCTAATCTCTTAGCGCAATTTGATGTTGTAATTGC - 23100 
-HIIVKHTLC*SIiSAI*CCNC 

- I *LLNTRCANLLAQFDVVIA 

YNC*THVVLIS*RNLML*LL 
23101 - TGCTTGTCCTAAGAATGGTTTGACATAAGCCAAAATTTTACTCCAAGGAACACTATTAAT - 23160 
-CLS*EWFDISQNFTPRNTIN 
-ACPKNGLT *AKILLQGTLLI 
LVLRMV* HKPKFYSKEHY*L 
23161 - TGCAGCAATACCATGAGTGGCAATTGTTTTTAAACCTAAGGCTAGTGAAAGCTCATTAGG - 23220 
-CSNTMSGNCF*T*G**KI»IR 
-AAIP*VAIVFKPKASESSLG 
QQYHEWQLFLNLRLVKAH*V 
23221 - TTTCTTAATGGTAATGCTTGTGTTTTCCACATAAGCAGCCATAAGATCCTCATGACCTAA - 23280 
-FLNGNACVFHISSHKILMT* 
-FLMVMLVFST*AAIRSS* PN 
S*W*CLCFPHKQP*DPHDLT 
23281 - CTCTTGTGTTACTTTAACACCTTCATCTGATGGTTTAAGTATGACATTGCCTACAACTTC - 23340 
-LLCYFNT FI * WFKYDIAYNF 
-SCVTLTPSSDGLSMTLPTTS 
LVLL*HLHLMV*V*HCLQLR 
23341 - GGTAGTTTTCACGTCACACTCTATGACTTCCTTCTGTATGGTAGGATTTTCCACTACTTC - 23400 
-GS FH.VTLY D FLLYGRI FHYF 
-VVFTSHSMTSFCMVGFSTTS 
*FSRHTL*LPSVW*DFPLIi-L 

23401 - TTCAGAGGTGGGTTGTTGACTTTCACAAGCAAGATTGTCCATTCCTTGTGTGTCTTCTAC - 23460

-FRGGLLTFTSKIVHSLCVFY 
-SEVGC*LSQARLSIPCVSST 
QRWVVDFHKQDCPFLVCLLIi 
23461 - TGCCAGAACTTCAAATGAATTTGAAGTATCTACTGGCTTTGTACTCGAAAGACAACGTAA - 23520

- C Q N F K * I * S IYWLCTPKTT* 
-ARTSNEFEVSTGFVLQRQRK 

PELQMNLKYLLALYSKDNVN 
23521 - ACACCAAGTGTTTGGTTTGAACGTTGTCTTGGTTGTAGCCTGGTTAATGTGCCAAACAAT - 23580
-TPSVWFERCLGCSLVNVPNN 
-HQVFGLNVVLVVAWLMCQTI 
T K C L V * T L S W L * PG*CAKQL 

23581 - TGGCTTATGCAGTAATTTAGCACCTTTCTTGAAACTCGCTGAATAGTGTCTATAGTCAAT - 23640 

- W L M Q * FSTFLETR* IVS I V N 
-GLCSNLAPFLKLAE*CL*SI 

A Y A V I *HLS*NSLNSVYSQ* 
23641 - AGCCACTACATCGCCATTCAAGTCTGGGAAGAATGTGACAGATAGCTCTCGTGAAGCTGG - 23700 
-SHYIAIQVWEECDR*LS*SW 
-ATTSPFKSGKNVTDSSREAG 
PLHRHSSLGRM* QIALVKLA 
23701 - CTTTGTGAAGCCTGTCATTTGATTTAAATCATCAGCAAATTTTGTGTTAGAACATGTGAG - 23760 
-LCEACHLI * I ISKF CVRTCE 
-FVKPVI* FKSSANFVLEHVS 
X>*SLSFDLNHQQILC*NM*V 
23761 - TTTGAAATTATCAAAACTCGCATTTGGTAATGGTTGAGTTGGTACAAGGTCTATAGGCTG - 23820 

- F E I IKTRI W * WLSWYKVYRL 
-LKLSKLAFGNG* VGTRSIGC 

*NYQNSHLVMVELVQGL*AA 
23821 - CTCTGTATAGTAAGCATTATCCTTTTTATAATACCCATCCAATTTTGGTTCAATCTCTGT - 23880 
-LCIVSIILFIIPIQFWFNLC 
-SV**ALSFL*YPSNFGSISV 
LYSKHYPF YNTHPILVQSLC 
23881 - GTAAGTAACTCCATCGAGTTTATACGACACAGGCTTGATGGTTGTAGTGTAAGATGTTTC - 23940
-VSNS IEFIRHRLDGCSVRCF 

- *VTPSSLYDTGLMVVV*DVS 

K*LHRVYTTQA*WL* CKMFP 
23941 - CTTGTAGAAAACATCAGTCACTGGTCCTTTGTACTCTGACATCTTTGTAAGGTGAGCTCC - 24000 
-LVENI SHWS FVL* HLCKV'S S 
-L*KTSVTGPLYSDIFVR*AP 
CRKHQSLVLCTLTSL*GELR 
24001 - GTCAATACGATAGAGGGTCTCCTTAGCAGTTATATGAGTGTAATGACCACACTGATAGTT - 24060 
-VNT IEGLLS SYMSVMTTLIV 
-SIR*RVSLAVI*V**PH**L 
QYDRGSP*QLYECNDHTDSY 
24061 - ACCAGTGTACTCATTCGCACATAAGAATGTACCTTGCTGTAATTTATACTCAGCAGGTGG - 24120
-TSVLI RT * E C T L L *- F I LSRW 

- PVYS FAHKNVPCCNLYSAGG 

QCTHSHIRMYLAVIYTQQVV 
24121 - TGCAGACATCATAACAAAAGAAGACTCTTGTTGTACTAGATATTGTGTAGCATCACGACC - 24180 
-CRHHNKRRLLLY* ILCS ITT 
-ADIITKEDSCCTRYCVASRP 
Q T S * QKKTLVVLDIV*HHDH 
24181 - ACACACACATGGAATGGAAACACCTGTCTTAAGATTATCATAAGATAGAGTACCCATATA - 24240 
-THTWNGNTCLKI I X R * S T H I 
-HTHGMETPVLRLS* DRVPIY 
THMEWKHL3* DYHKIEYPYT 
24241 - CATCACAGCTTCTACACCCGTTAAGGTAGTAGTTTTCTGACCACAATGTTTACACACCAC - 24300 
-HHSFYTR* GSSFLTTMFTHH 
-ITASTPVKVVVF*PQCLHTT 
SQIiLHPLR**FSDHNVYTPH 
24301 - ATTAAGAACTCGCTTTGCAGATTCCAAATTAGCATGCTGTAGAAGATGGGTCATAGTTTC - 24360 
-IKNSLCRFQI S M L * KMGHSF 
-LRTRFADSKLACCRRWVIVS 
*ELALQI PN*HAVEDGS*FL 
24361 - TCTGACATCACCAAGCTCGCCAACAGTTTTATTACTGTAAGCGAGTATGAGTGCACAAAA - 24420

- S D I TKLANS FITVSEYECTK 

- LTSPSSPTVLLL*ASMSAQK 

*HHQARQQFYYCKRV*VHKS 
24421 - GTTAGCAGCATCACCAGCACGGGCTCTATAATAAGCCTCTTGAAGTGCTGGTGCATTGAA - 24480

- V S S I TSTGS I ISL.LKCWCIE 
-LAAS P A R A L * * A S * SAGALN 

*QHHQHGLYNKPLEVLVH*I 
24481 - TTTGACTTCAAGCTGTTGAAGTGCTAATAAAACACTAGACAAATAACAATTGTTATCAGC - 24540 
-FDFKLLKC**NTRQITIVIS 

- L T S S C * SANKTLDK*QLLSA 

*LQAVEVLIKH*TNNNCYQP 
24541 - CCATTTAATTGAAGTTAAACCACCAACTTGAGGAAATTTCCATTTCTTTGTGTGGTTTAA - 24600
-PFN*S*TTNLRKFPFLCVV*
-HLIEVKPPT*GNFHFFVWFK
I*LKLNHQLEEISISLCGLK
24601 - AGCAGACATGTACCTACCAAGAAAACTCTCATCAAGAGTATGGTAGTACTCGAAAGCTTC - 24660
-SRHVPTKKXLIKSMVVLESF 
-ADMYLPRKLSSRVW*YSKAS 
QTCTYQENSHQEYGSTRKLH 
24661 - ACTACGTAGTGTGTCATCACTAGGTAGTACAAAGAAAGTCTTACCCTCATGATTTACATG - 24720
-TT*CVITR*YKESL TLMIYM 
-LRSV SSLGSTKKVLPS * FT* 
YVVCHH*VVQRKSYPHDLHE 
24721 - AGGTTTAATTTTTGTAACATCAGCACCATCCAAGTATGTTGGACCAAACTGCTGTCCATA - 24780
-RFNFCNI ST IQVCWTKLLS I 
-GLIFVTSAPSKYVGPNCCPY 
V* F L * HQHHPSMLDQTAVHM 
24781 - TGTCATAGACATATCCACAAGCTGTGTGTGGAGATTAGTGTTGTCCACAGTTGTGAACAC - 24840
-CHRHIHKLCVEISVVHSCEH 
-VIDISTSCVWRLVLSTVVNT 
S*TYPQAVCGD*CCPQL*TL 
24841 - TTTTATAGTCTTAACCTCCCGCAGGGATAAGAGACTCTTTAGTTTGTCAAGTGAAAGAAC - 24900 
-FYSLNLPQG*ETL + FVK*KN 
-FIVLTSRRDKRLFSLS SERT 
L*S*PPAGIRDSLVCQVKEP 
24901 - CTCACCGTCAAGATGAAACTCGACGGGGCTCTCCAGAGTGTGGTACACAATTTTGTCACC - 24960
-LTVKMKLDGALQ SVVHN FVT 
- V SPSR*NSTGLSRVWYTILSP 
HRQDETRRGSPECGTQFCHH 
24961 - ACGCTTAAGAAATTCAACACCTAACTCTGTACGCTGTCCTGAATAGGACCAATCTCTGTA - 25020 
-TLKKFNTT * L C T L S * IGPISV 
-RLRNSTPNSVRCPE* D Q S L * 
A*EIQHLTLYAVLNRTNLCK 
25021 - AGAGCCAGCCAAAGAAACTGTTTCTACAAAGTGCTCCTCAGATGTCTTTGATGACGAAGT - 25080 
-RASQRNCFYKVLLRCL* * R S 
-EPAKETVSTKCSSDVFDDEV 
SQPKKLFLQSAPQMSLMTK* 
25081 - GAGGTATCCATTATATGTAGTAACAGCATCTGGTGATGATACTGACACTACGGCAGGAGC - 25140 
'-EVSIICSNSIW**Y*HYGRS 
-RYPLYVVTASGDDTDTTAGA 
G I H Y M * ^QHLVMILTLRQEL 
25141 - TTTAAGAGAACGCATACAGCGCGCAGCCTCTTCAAGATTAAAACCATGTGTCACATAACC - 25200
-FKRTHTARS LFKIKTMCH I T 
-LRERIQRAASSRLKPCVT* P 
*ENAYSAQPLQD*NHVSHNQ 
25201 - AATTGGCATTGTGACAAGCGGCTCATTTAGAGAGTTCAGCTTCGTAATAATAGAAGCTAC - 25260
-NWHC DKRLI *RVQLRNNRS Y 
-IGIVTSGSFREFSFVI I E A T 
L A L * QAAHLESSAS* * * K L Q 

25261 - AGGCTCTTTACTAGTATAAAAGAAGAATCGGACACCATAGTCAACGATGCCCTCTTGAAT - 25320

- R L F T S IK EES DT I V N DALLN 
-GSLLV*KKNRTP*STMPS*I 

A L Y * YKRRIGHHSQRCPLEF 
25321 - TTTAATTCCTTTATACTTACGTTGGATGGTTGCCATTATGGCTCTAACATCCATGCATAT - 25380 
-FNSFILTLDGCHYGSNIHAY 
-LIPLYLRWMVAIMALTSMHI 
*FLYTYVGWI*PLWL*HPCI* 
25381 - AGGCATTAATTTTCTTGTCTCTTCAGCATGAGCAAGCATTTCTCTCAAATTCCAGGATAC - 25440
-RH * FSCLFSMSKHFSQ I PGY 
-GINFLVSSA*ASISLKFQDT 
A L I FLSLQHEQAFLSNSRIQ 
25441 - AGTTCCTAGAATCTCTTCCTTAGCATTAGGTGCTTCTGAAGGTAGTACATAAAATGCAGA - 25500
-SS*NLFLSIRCF*R*YIKCR
-VPRISSLALGASEGST*NAD
FLESLP*H*VLLKVVHKMQI
25501 - TTTGCATTTCTTAAGAGCAGTCTTAGCTTCCTCAAGTGTATAACCAGCACATCCTTGTCC - 25560
-FAFLKSSLSFLKCI TSTSLS 

- LHFLRAVLASSSV* PAHPCP 

CIS*EQS*LPQVYNQHIIiVQ 
25561 - AGGGTACGTGGTTATATACTCATCAACTGGCACTTTCTTCAAAGCTCTTGAGAGCATCTC - 25620
-RVRGY IJjlNWHFLQSS * EHL 
-GYVV IYSSTGTFFKALESIS 
GTWLYTHQLALSSKLLRASQ 
25621 - AGTAGTGCCACCAGCCTTTTTGGAGGGTATTACAACACAAGTGATATCACCACTAGTGAT - 25680 

- S SAT S LFGGYYNT S D I T TS D 
-VVPPAFLEGITTQVISPLVI 

* CHQPFWRVIiQHK*YHH* * * 
25681 - AACATCACCTACCATGTAAGGTGCATCCTTCTCAAGGAAAGACATATCTTCACCTCTAAG - 25740 
-NITYHVRCILLKERHI FTSK 
-TSPTM^GASFSRKDISSPLS 
HHLPCKVHPSQGKTYI*HL*A 
25741 - CATGTTCTGAGAATCATGGTAAAGCTTACCATTGATATCAGCAAACAAGAGTAACTTATT - 25800
-HVLRIMVKLTIDISKQE*LI
-MF*ESW*SLPLISANKSNLL
CSENHGKAYH*YQQTRVTYW
25801 - GGTAAGAAACTTAGTTTCTTCCAGTGTTGTGGTAACCTCATCAATGCAGGCCTTAATTTT - 25860
-GKKLSFFQCCGNLINAGL.NF 
-VRNLVSSSVVVTSSMQALIF 
*ET*FLPVLW*PHQCRP*FL 
25861 - TGGCTTCACATCGACAGGCTTCTGTACGACAGATTTCTCCTCAGTTTTGGAATCTTCTGT - 25920 

- W L H I DRLLYDRFLLSFGIFC 
-GFTSTGFCTTDFSSVLESSV 

ASHRQASVRQIS PQFWNLLC 
25921 - GTTTGGTGGCTCCTCTTGTTTAGGTGCTTCCACTCTAGGCTTCAGGTTATCAAGATAATC - 25980 
-VWWLLLFRCFHSRLQVIKII 
-FGGSSCLGASTLGFRLSR*S 
LVAPLV*VLPL*ASGYQDNP 
25981 - CATGACAACCTGCTCATAAAGAGCTTTGTCATTGACTGCAATATAAACCTGTGTACGAAC - 26040
-HDNLLIKS FVIDCNINLCTN 
-MTTCS*RALSLTAI*TCVRT 
*QPAHKELCH*LQYKPVYEP 
26041 - CGTCTGCACGCACACTTGTAAAGACTGAAGTGGTTTAGCACCAAATATGCCTGCTGACAA - 26100
-RLHAHL *RLKWFSTKYAC*Q 
-VCTHTCKD* SGLAPNMPADN 
SARTLVKTEVV* HQICLLTT 
26101 - CAATGGTGCAAGTAAGATGTCCTGTGAATTGAAATTTTCATATGCTGCCTTAAGAAGCTG - 26160 
-QWCK*DVL* IEIFICCLKKL 
-NGASKMSCELKFSYAALRSW 
MVQVRCPVN*NFHMLP*EAG 
26161 - GATGTCCTCACCTGCATTTAGGTTAGGTCCAACAACATGCAGACACTTCTTAGCAAGATT - 26220 
~ D V L T C I *VRSNNMQTLLSKI 
-MSSPAFRLGPTTCRHFLARL 
CPHLHLG^VQQHADTS^QDY 
26221 - ATGTCCAGAAAGCAAACAAGACCCTCCTACTGTAAGAGGGCCATTTAGCTTAATGTAATC - 26280 
-MSRKQTRPSYCKRAI * L N V I 
-CPESKQDPPTVRGPFSLM*S 
VQKANKTLLL*EGHLA*CNH 
26281 - ATCACTCTCCTTTTGCATGGCACCATTGGTTGCCTTGTTGAGTGCACCTGCTACACCACC - 26340 
-ITLLLHGTIGCLVE1CTCYTT 
-SLSFCMAPLVALLSAPATPP 
HSP FAWHHWLPC*VHLLHHH 
26341 - ACCATGTTTCAGGTGTATGTTAGCAGCATTTACAATCACCATAGGATTAGCACTTTGTGC - 26400 
-TMFQVYVSS IYNHHRI STLC 
-PCFRCMLAAFTITIGLALCA 
HVSGVC*QHLQSP*D*RFVP 
26401 - CTCCTTAACGATGTCAACACATTTAATGGCAACATTGTCAGTAAGTTTTAAATAACCAGT - 26460
-LLNDVNTFNGNIVSKF* ITS 
-SLTMSTHLMATIjSVSFK* PV 
P*RCQHI*WQHCQ*VLNNQ* 
26461 - AAACTGATTAACTGGTTCTTCAGGTGTAGGTTCTGGTTCTGGCTCAATCTCTGATTGCTC - 26520
-KLINWFFRCRFWFWLNL*LL 
-N*LTGSSGVGSGSGSISDCS 
T D * LVLQV*VLVLAQSLIAQ 
26521 - AGTAGTATCATCCAGCCAGTCTTCCTCTTCTTCTTCCTCAACTCGAACTGTTTCAGCTGA - 26580
-SSIIQPVFLFFFLNSNCFS*
-VVSSSQSSSSSSSTRTVSAE
*YHPASLPLLLPQLELFQLR
26581 - GGCACCAAATTCCAGAGGGAGACCTTGATAATCATCCTCTGTACCGTACTCATGTTCACA - 26640 
-GTKFQRETLI II LCTVLMFT 
-APNSRGRP**SSSVPYSCSQ 
HQIPEGDLDNHPLYRTHVHR 
26641 - GGTTTCATCAATTTCTTCTTCCTCACACTCTGCATCGTCCTCTTCTTCCTCATCTGGAGG - 26700
-GFINFFFLTLCIVLFFLIWR 
-VSSISSSSHSASSSSSSSGG 
FHQ FLLPHTTjHRPLLPHLEG 
26701 - GTAAAAGGAACAATACATACGTGATGAAAAGTTTTCTTCACCAGCATCATCAAATAAGTA - 26760
-VKGTIHT**KVFFTSIIK*V 
-*KEQYIRDEKFSSPASSNK* 
KRNNTYVMKSFLH-QHHQI SR 
26761 - GAATGTAGCTACACTCCACTCATCAAGATCAATACCCATGTTGGTAAGGAGATCAGAAAC - 26820
-ECSYTPLIKINTHVGKEIRN 
-NVATLHSSRSIPMLVRRSET 
M*LHSTHQDQYPCW* GDQKL 
26821 - TGGTTGTAAAGTCTTCACAACAGCCTCTGCTACAACACATGCAAACTCAGTAACTTCGGT - 26880 
~ W L * SLHNSLCYNTCKLSNFG 
-GCKVFTTASATTHANSVTSV 
VVKSSQQPLLQHMQTQ*LRY 
26881 - ACCGGATTCAACAGTGTAGACAGAGCACTTTTCATTAAGCACTTTGTCAACACGTTCATC - 26940
-TGFNSVDRALFI KHFVNTFI 
~PDSTV*TEHFSLSTLSTRSS 
RXQQCRQSTFH*ALCQHVHQ 
26941 - AAGCTCAAATGTGATTCTCACATTCTTGTAACCTTGAACTTCCCAAACAGTATCTTCTCC - 27000
-KLKCDSH ILVTLNFPNS I FS 
-SSNVILTFL*P*TSQTVSSP 
A Q M * FSHSCNLELPKQYLLQ 
27001 - AAAGGTTACACCTTTAATTGGTGCACCCCCTTTTAAGCGAAAGACATTGTTTGTAGCCAG - 27060
-KGYTFNWCTPF*AKDIVCSQ
-KVTPLIGAPPFKRKTLFVAS
RLHL*LVHPLLSERHCL*PV
27061 - TAAACCAGGAGACAATGCGCAGTATTGTTCTTTGTCCTTAATCTCTAAGAGCATGAGGCC - 27120 

- * TRRQCAVLFFVLNL* EHEA 

- KPGDNAQYCSLSLISKSMRP 

NQETMRSIVLCP * S L R A * G H 
27121 - ATTTACACAGACTGGTGTGCCGACGATAGCTCCATTTGTGAAGCTATCAACGGGCGTCTC - 27180 
-IYT DWCADDSS I CEAI NGRL 
-FTQTGVPTIAPFVKLSTGVS 
LHRLVCRR^LHL* SYQRASR 
27181 - GAGTGCTTCGAGTTCACCGTTCTTGAGAACAACCTCCTCAGAGGTAAGTACTGTGTCATG - 27240
-ECFE FTVLENNLLRGKYCVM 

- SASSS. PFLRTTSSEVSTVSC 

VLRVHRS*EQPPQR*VLCHV 
27241 - TGAATCACCTTCAAGAAAGGTTACTTCTTTTGGTGCCTTAAGAGGCATGAGTAGTTGCAG - 27300 

- * I TFKKGYFFWCLKRHE*LQ 
-ESPSRKVTSFGALRGMSSCS 

NHLQERLLLLVP*EA*VVAA 
27301 - CTGCTCCTTGCCACGTATACACTGACGGTAAAGTCCCTTGCTTTGAGCGATGAAGACTTC - 27360 
-LIiLATYTLTVKS LALS DEDF 
-CSLPRIH*R*SPLL*AMKTS 
APCHVYTDGKVPCFER*RLH 
27361 - ACCTAAGTTGAGTGATCGCAACTTTGCGCCAGCGATAGTGACTTGATCAATGCACATTTC - 27420 

- T * V E * SQLCASDSDLINAHF 
-PKLSDRNFAPAIVT* SMHIS 

LS*VIATLRQR* *LDQCTFR 
27421 - GAGTGCCTTGTTAACAACATCAATGAAGCATTTTACACAATCCTTGATGTTATCTGAAGC - 27480 
-ECLVHNINEAFYTILDVI*S 
-SALLTTSMKHFTQSLMLSEA 
VPC*-QHQ*SILHNP*CYLKQ 
27481 - AACCTGTATTTGACCCTTGACGATGTCAAAAACACCTGT^iATGAGAAATTTGAGAATCTC - 27540 

- N L Y LT LDDVKN TCWEKFENL 
-TCI^PLTMSKTPVMRtslLRIS 

PVFDP*RCQKHL**EI*BSSP 
27541 - CCAAGCATCCTTGAGAAATTCAACTCCTGCACTAAGTTTCGCCTCAATCCATTCAAAGAT - 27600 
-PS I LEKFNSCTKFRLNPFKD 
-QASLRNSTPALS FASI HSKI 
KHP*EIQLLH*VSPQSIQR* 
27601 - AGGCCTGAGTTTTTCAACAGTAGTGCCCAAAAGATTAGACAACCACTGAGAAGTCTGTTG - 27660 
-RPEFFKSSAQKIRQPLRSLL 
-GLSFSTVVPKRLDNH*EVCC 
A*VFQQ*CPKD* TTTEKSVV 
27661 - TACAAGACCACCAGTTACATATGCCATAATAATGACACTGTTGGTGAGCAGGTCTGAAGT - 27720
-YKTTSYICHNNDTVGEQV* S 
-TRPPVTYAIIMTLLVSRSEV 
QDHQLHMP** *HCW*AGLKY 
27721 - ATAAACCATGGCGTCGACAAGACGTAATGACTGTTCAGAAATACCATCAAGTATGGTGAC - 27780
-INHGVDKT* *LFRNTIKYGD 

- * TMASTRRNDCSEIPSSMVT 

KPWRRQDVMTVQKYHQVW*Q 
27781 - AGCTGCTCTTTGCAAATCAGGAATTGAGTGGTTTGCTGCATCAAGTGTGCGCGCAAAAAT - 27840
-SCSLQIRN*VVCCIKCARKN 
-AALCKSGIEWFAASSVRAKI 
LLFANQELSGLLHQVCAQKL 
27841 - TGATCTGATAACACCAGCAGCCTGTGAGGGAAAACCACACAGTGGTGTTAAAACTGATCT - 27900

- * S DNTSSL* GKTTQWC*N* S 
-DLITPAACEGKPHSGVKTDL 

I**HQQPVRENHTVVLKLIS 
27901 - CTGTTGTCCAATGTTCCAAGCACCTTTTACGGGCTTTCCCTTGGTAACTTTATAGTTACC - 27960

- L L SNVPSTFYGLSLGN F I V T 
-CCPMFQAPFTGFPLVTL*LP 

VVQCSKHLLRAFPW* LYSYR 

27961 - GCAGGACTCAACAATGGTTTTGAAAGACTTGTAATCAAGACTCTTTATAGTGTCAATAAA - 28020

-AGLNNGFERLVI KTLYSVNK 
~QDSTMVLKDL*5RLFIVS IK 
RTQQWF*KTCNQDSL*CQ*R 
28021 - GGCACTTGTAGAAGCAGAGAAAGATGCCAAAATGATGGCAACCTCTTCATTCAAATGAAA - 28080
-GTCRSRERCQNDGNLFIQMK
-ALVEAEKDAKMMATSSFK*K
HL*KQRKMPK*WQPLHSNEN
28081 - ATCGCCAACAATGTTAATGTTAACACGTTCACGACTCAGTATCTCAAGGAGATCCTCATT - 28140 
-IANNVNVNT FTTQYLKEILI 

- S PTMLMLTRSRLiS ISRRSSF 

RGQC*C*HVHDSVSQGDPHS 
28141 - CAAGGTCTCCACATTGTCACCAGTAATGCCAGTATGGCCTGAGCCAATATCAGCACTAGC - 28200 
-QGLHIVTSNASMA*ANISTS 
-KVSTLSPVMPVWPEPISALA 
RSPHCHQ*CQYGLSQYQH*H 
28201 - ACGAGGAACCCAGTAGGCACGCTTATTATAGCAGCCAACATAGGCAAACACACAGCCTCC - 28260
-TRNPVGTLI IAANIGKHTAS 
-RGTQ*ARLL*QPT*ANTQPP 
EEPSRHAYYSSQHRQTHSLQ 
28261 - AAAACATCTAGTCCTACCTCCCTTGCGGAGTCGAGTTTCAATGTTTGAGTGGTTGTGATA - 28320
-KTSSPTSLAESSFNV*VVVI
-KHLVLPPLRSRVSMFEWL**
NI*SYLPCGVEFQCLSGCDN

28321 - ATCTGCAACACTATGCTCAGGTCCAATCTCTGGGTCTTGACAGGCAGGACATGGCATTTT - 28380

-I CNTMLRSNLWVLTGRTWHF 
-SATLCSGPI SGS*QAGHGIF 
LQHYAQVQSLGLDRQDMAFS 
28381 - CACTACAGCATTAGTAGGTAGGTACCCACATGTAGTAGGTCCTTCAATAACTAAATTTTC - 28440
-HYSISR*VPTCSRSFNN*IF 
-TTALVGRYPHVVGPS ITKFS 
LQH**VGTHM**VLQ*LNFQ 
28441 - AGTGCCACAATGTTCACAAGTGGCTTTCAGAAAGTCGCACGTCTGCCATGAAACTTCATC - 28500
-SATMFTSGFQKVARLP * N F I 
-VPQCSQVAFRKSHVCHETSS 
CHNVHKWLSESRTSAMKLHR 
28501 - GCAATGATTACATTTCATCAAGGTAGACAAGTGCATATTGTTACACTCCTGTGGAGATGC - 28560 
-AM ITFHQGRQVH IVTLLWRC 

- Q * LHFIKVDKCI LLHSCGDA 

NDYISSR*TSAYCYTPVEMQ 
28561 - AACAGGGTACACAGAGCGTATACGCCCCATGAAACCCTCAGTCTTTTTCTTTTCAACACG - 28620
-NRVHRAYTPHETLSLFLFNT 
-TGYTERIRPMKPSVFFFSTR 
QGTQSVYAP*NPQSFSFQHV 
28621 - TGGTTGAATGACTTTGACTTTTGAGTTAAGAGGAAACACAAACTTTGGGCATTCCCCTTT - 28680 
-WLNDFDF*VKRKHKLWAFPF 
-G*MTLTFELRGNTNFGHSPIi 
VE*L*LLS*EETQTLGIPL* 
28681 - GAAAGTGTCAAATTTCTTGGCACTCTTAATTTCGAAGGGTGTCTGGTGCTCGTAGCTCTT - 28740
-ESVKFLGTLNFEGCLVLVAL
-KVSNFLALLISKGVWCS*LL
KCQISWHS*FRRVSGARSSY
28741 - ATCAGAGCGCTCAGTGAACCAGGCAATTTCATGCTCATGGTCACGGCAGCAGTAGACACC - 28800 
-I RALSEPGN FMLMVTAAVDT 
-SERSVNQAISCSWSRQQ*TP 
QSAQ*TRQFRARGHGSSRHL 
28801 - TCTCTTCGACTCGATGTAATCAAGTTGTTCGGAAAGAGTGCACATTGACTTGCCCGCGCG - 28860

- S LRLDVIKL FGKSAH* LARA 
-LFDSM^SSCSERVHIDLPAR 

SSTRCNQVVRKECTLTCPRV 
28861 - TGCGAGAAAATCTTTGATGCAATCAAGAGGGTACCCATCTGGGCCACAGAAATTGTTGTC - 28920
-CEKIFDAIKRVPIWATEIVV 
-ARKSLMQSRGYPSGPQKLLS 
RENL*CNQEGTHLGRRNCCR 
28921 - GACATAGCGAGTGACTGCACCTCCATTGAGCTCACGAGTGAGTTCACGGAGTGCACCACT - 28980
-DIASDCTSI ELTSEFTECTT 
-T*RVTAPPLSSRVSSRSAPL 
HSE*LHLH*AHE*VHGVHRC 
28981 - GCCATGCTTAGTGTTCCAGTTTTGTTCATAATCTTCAATGGGATCAGTGCCAAGCTCGTC - 29040 
-AMLSVPVLF I IFNGISAKLV 
~PCLVFQFCS*SSMGSVPSSS 
HA*CSSFVHNLQWDQCQARH 
29041 - ACCTAAGTCATAAGACTTTAGATCGATGCCATAGCTATGACCACCGGCTCCCTTATTACC - 29100 
-T*VIRL* I DAIAMTTGSLIT 

- PKS* D F R S M P * L * PPAPLLP 

LSHKTLDRCHSYDHRLPYYR 
29101 - GTTCTTACGAAGAAGAACATTGCGGTATGCAATTGGGGTTTCGCCCACATGTGGCACGAG - 29160
-VLTKKNIAV CNW GFAHMWHE 
-FLRRRTLRYAIGVSPTCGTS 
SYEEEHCGMQLGFRPHVARV 
29161 - TACTCCCAGTGTTATACCGCTACGACCGTACTGAATGCCGTCCATTTCTGCAACCAGCTC - 29220 

- Y SQCYTATTVLNAVHFCNQL 
-TPSVIPLRPY*MPSISATSS 

LPVLYRYDRTECRPFLQPAQ 
29221 - AACGACCTTGTGGCCGTGATTGGTGCTTAAGGCATCAGAACGTTTAATGAACACATAGGG - 29280

- N DLVAVIGA * GI RTFNEHIG 
-TTLWP*LVLKASERLMNT*G 

RPCGRDWCLRHQNV* * THRA 
29281 - CTGTTCAAGCTGGGGCAGTACGCCTTTTTCCAGCTCTACTAGACCACAAGTGCCATTTTT - 29340 

- L FKLGQYAFFQLY* TTSAI F 
-CSSWGSTPFSSSTRPQVPFL 

VQAGAVRL FPALLDHKCHF* 
29341 - GAGGTGTTCACGTGCCTCCGATAGGGCCTCTTCCACAGAGTCCCCGAAGCCACGCACTAG - 29400

- E V F T C L R * GLFHRVPSATH* 
-RCSRASDRASSTESPKPRTS 

GVHVPPIGPLPQSPRSHALA 
29401 - CACGTCTCTAACCTGAAGGACAGGCAAACTGAGTTGGACGTGTGTTTTCTCGTTGACACC - 29460
-HVSNLKDRQTELDVCFLVDT 
-TSLT*RTGKLSWTCVFSIiTP 
RL*PEGQAN*VGRVFSR*HQ 
29461 - AAGAACAAGGCTCTCCATCTTACCTTTCGGTCACACCCGGACGAAACCTAGGTATGCTGA - 29520 
-KNKALHLT F R S H P D E T * V C * 
-RTRLSILPFGHTRTKPRYAD 
EQGSPSYLSVTPGRNLGMLM 
29521 - TGATCGACTGCAACACGGACGAAACCGTAAGCAGTCTGCAGAAGAGGGACGAGTTACTCG - 29580
- * STATRTKP *AVCRRGTSYS 
DRLQHGRNRKQSAEEGRVTR 
I DCNTDETVS SLQKRDELLV 
29581 - TTTCTTGTCAACGACAGTAAAATTTATTATTGTTTATACTGCGTAGGTGCACTAGGCATG - 29640 
-FLVNDSKI Y YCLYCVGALGM 
-FLSTTVKFI IVYTA*VH*AC 
SCQRQ*NLLLFI LRRCTRHA 
29641 - CAGCCGAGCGACAGCTACACAGATTTTAAAGTTCGTTTAGAGAACAGATCTACAAGAGAT - 29700 
-QPSDSYTDFKVRLENRSTRD 
-SRATATQI LKFV*RTDLQEI 
AERQliHRF *SSFREQIYKRS 
29701 - CGAGGTTGGTTGGCTTTTCCTGGGTAGGTAAAAACCTAATAT - 29742 
-RGWLAFPG*VKT*YX
-EVGWLFLGR*KPNX 
RLVGFSWVGKNLIX 
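
The layout of the listing above (each 60-nucleotide line followed by its translation in three forward reading frames, with "*" marking a stop codon) can be reproduced with a short script. The following is a minimal sketch only, assuming the standard genetic code; the names `CODON_TABLE`, `translate` and `three_frames` are hypothetical and are not part of the original disclosure. Unlike the listing, which carries a split codon over to the next line, this sketch simply drops any incomplete trailing codon.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: the
# listing uses the standard code, with '*' denoting a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(seq, frame):
    """Translate seq from a 0-based frame offset; unknown codons become 'X'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X")
        for i in range(frame, len(seq) - 2, 3)
    )


def three_frames(seq):
    """Return the translations for frame offsets 0, 1 and 2."""
    return [translate(seq, f) for f in range(3)]


if __name__ == "__main__":
    # Final line of the listing (positions 29701 to 29742).
    last_line = "CGAGGTTGGTTGGCTTTTCCTGGGTAGGTAAAAACCTAATAT"
    for frame in three_frames(last_line):
        print(frame)
```

Applied to the final line of the listing, the three frames agree with the translations printed beneath positions 29701 to 29742, apart from the trailing "X" that the listing uses for the incomplete last codon.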
PRIMER AND PROBE SEQUENCES

Forward primer: 5'-CAGAACGCTGTAGCTTCAAAAATCT-3' (SEQ ID NO:2471)
Reverse primer: 5'-TCAGAACCCTGTGATGAATCAACAG-3' (SEQ ID NO:2472)
Probe: 5'-TCTGCGTAGGCAATCC-3' (SEQ ID NO:2473) (5' labeled with FAM; 3' labeled with NFQ-MGB)



Forward primer: 5'-ACCAGAATGGAGGACGCAATG-3' (SEQ ID NO:2474)
Reverse primer: 5'-GCTGTGAACCAAGACGCAGTATTAT-3' (SEQ ID NO:2475)
Probe: 5'-ACCCCAAGGTTTACCC-3' (SEQ ID NO:2476) (5' labeled with FAM; 3' labeled with NFQ-MGB)